Search Results: "hec"

3 March 2024

Paul Wise: FLOSS Activities Feb 2024

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review
  • Spam: reported 1 Debian bug report
  • Debian BTS usertags: changes for the month

Administration
  • Debian BTS: unarchive/reopen/triage bugs for reintroduced packages: ovito, tahoe-lafs, tpm2-tss-engine
  • Debian wiki: produce HTML dump for a user, unblock IP addresses, approve accounts

Communication
  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors The SWH work was sponsored. All other work was done on a volunteer basis.

2 March 2024

Ravi Dwivedi: Malaysia Trip

Last month, I had a trip to Malaysia and Thailand. I stayed for six days in each of the countries. The selection of these countries was due to both of them granting visa-free entry to Indian tourists for some time window. This post covers the Malaysia part and Thailand part will be covered in the next post. If you want to travel to any of these countries in the visa-free time period, I have written all the questions asked during immigration and at airports during this trip here which might be of help. I mostly stayed in Kuala Lumpur and went to places around it. Although before the trip, I planned to visit Ipoh and Cameron Highlands too, but could not cover it during the trip. I found planning a trip to Malaysia a little difficult. The country is divided into two main islands - Peninsular Malaysia and Borneo. Then there are more islands - Langkawi, Penang island, Perhentian and Redang Islands. Reaching those islands seemed a little difficult to plan and I wish to visit more places in my next Malaysia trip. My first day hostel was booked in Chinatown part of Kuala Lumpur, near Pasar Seni LRT station. As soon as I checked-in and entered my room, I met another Indian named Fletcher, and after that we accompanied each other in the trip. That day, we went to Muzium Negara and Little India. I realized that if you know the right places to buy what you want, Malaysia could be quite cheap. Malaysian currency is Malaysian Ringgit (MYR). 1 MYR is equal to 18 INR. For 2 MYR, you can get a good masala tea in Little India and it costs like 4-5 MYR for a masala dosa. The vegetarian food has good availability in Kuala Lumpur, thanks to the Tamil community. I also tried Mee Goreng, which was vegetarian, and I found it fine in terms of taste. When I checked about Mee Goreng on Wikipedia, I found out that it is unique to Indian immigrants in Malaysia (and neighboring countries) but you don t get it in India!
Mee Goreng, a dish made of noodles in Malaysia.
For the next day, Fletcher had planned a trip to Genting Highlands and pre booked everything. I also planned to join him but when we went to KL Sentral to take the bus, his bus tickets were sold out. I could take a bus at a different time, but decided to visit some other place for the day and cover Genting Highlands later. At the ticket counter, I met a family from Delhi and they wanted to go to Genting Highlands but due to not getting bus tickets for that day, they decided to buy a ticket for the next day and instead planned for Batu Caves that day. I joined them and went to Batu Caves. After returning from Batu Caves, we went our separate ways. I went back and took rest at my hostel and later went to Petronas Towers at night. Petronas Towers is the icon of Kuala Lumpur. Having a photo there was a must. I was at Petronas Towers at around 9 PM. Around that time, Fletcher came back from Genting Highlands and we planned to meet at KL Sentral to head for dinner.
Me at Petronas Towers.
We went back to the same place as the day before where I had Mee Goreng. This time we had dosa and a masala tea. Their masala tea from the last day was tasty and that s why I was looking for them in the first place. We also met a Malaysian family having Indian ancestry dining there and had a nice conversation. Then we went to a place to eat roti canai in Pasar Seni market. Roti canai is a popular non-vegetarian dish in Malaysia but I took the vegetarian version.
Photo with Malaysians.
The next day, we went to Berjaya Time Square shopping place which sells pretty cheap items for daily use and souveniers too. However, I bought souveniers from Petaling Street, which is in Chinatown. At night, we explored Bukit Bintang, which is the heart of Kuala Lumpur and is famous for its nightlife. After that, Fletcher went to Bangkok and I was in Malaysia for two more days. Next day, I went to Genting Highlands and took the cable car, which had awesome views. I came back to Kuala Lumpur by the night. The remaining day I just roamed around in Bukit Bintang. Then I took a flight for Bangkok on 7th Feb, which I will cover in the next post. In Malaysia, I met so many people from different countries - apart from people from Indian subcontinent, I met Syrians, Indonesians (Malaysia seems to be a popular destination for Indonesian tourists) and Burmese people. Meeting people from other cultures is an integral part of travel for me. My expenses for Food + Accommodation + Travel added to 10,000 INR for a week in Malaysia, while flight costs were: 13,000 INR (Delhi to Kuala Lumpur) + 10,000 INR (Kuala Lumpur to Bangkok) + 12,000 INR (Bangkok to Delhi). For OpenStreetMap users, good news is Kuala Lumpur is fairly well-mapped on OpenStreetMap.

Tips
  • I bought local SIM from a shop at KL Sentral station complex which had news in their name (I forgot the exact name and there are two shops having news in their name) and it was the cheapest option I could find. The SIM was 10 MYR for 5 GB data for a week. If you want to make calls too, then you need to spend extra 5 MYR.
  • 7-Eleven and KK Mart convenience stores are everywhere in the city and they are open all the time (24 hours a day). If you are a vegetarian, you can at least get some bread and cheese from there to eat.
  • A lot of people know English (and many - Indians, Pakistanis, Nepalis - know Hindi) in Kuala Lumpur, so I had no language problems most of the time.
  • For shopping on budget, you can go to Petaling Street, Berjaya Time Square or Bukit Bintang. In particular, there is a shop named I Love KL Gifts in Bukit Bintang which had very good prices. just near the metro/monorail stattion. Check out location of the shop on OpenStreetMap.

28 February 2024

Dirk Eddelbuettel: RcppEigen 0.3.4.0.0 on CRAN: New Upstream, At Last

We are thrilled to share that RcppEigen has now upgraded to Eigen release 3.4.0! The new release 0.3.4.0.0 arrived on CRAN earlier today, and has been shipped to Debian as well. Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms. This update has been in the works for a full two and a half years! It all started with a PR #102 by Yixuan bringing the package-local changes for R integration forward to usptream release 3.4.0. We opened issue #103 to steer possible changes from reverse-dependency checking through. Lo and behold, this just stalled because a few substantial changes were needed and not coming. But after a long wait, and like a bolt out of a perfectly blue sky, Andrew revived it in January with a reverse depends run of his own along with a set of PRs. That was the push that was needed, and I steered it along with a number of reverse dependency checks, and occassional emails to maintainers. We managed to bring it down to only three packages having a hickup, and all three had received PRs thanks to Andrew and even merged them. So the plan became to release today following a final fourteen day window. And CRAN was convinced by our arguments that we followed due process. So there it is! Big big thanks to all who helped it along, especially Yixuan and Andrew but also Mikael who updated another patch set he had prepared for the previous release series. The complete NEWS file entry follows.

Changes in RcppEigen version 0.3.4.0.0 (2024-02-28)
  • The Eigen version has been upgrade to release 3.4.0 (Yixuan)
  • Extensive reverse-dependency checks ensure only three out of over 400 packages at CRAN are affected; PRs and patches helped other packages
  • The long-running branch also contains substantial contributions from Mikael Jagan (for the lme4 interface) and Andrew Johnson (revdep PRs)

Courtesy of CRANberries, there is also a diffstat report for the most recent release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

25 February 2024

Jacob Adams: AAC and Debian

Currently, in a default installation of Debian with the GNOME desktop, Bluetooth headphones that require the AAC codec1 cannot be used. As the Debian wiki outlines, using the AAC codec over Bluetooth, while technically supported by PipeWire, is explicitly disabled in Debian at this time. This is because the fdk-aac library needed to enable this support is currently in the non-free component of the repository, meaning that PipeWire, which is in the main component, cannot depend on it.

How to Fix it Yourself If what you, like me, need is simply for Bluetooth Audio to work with AAC in Debian s default desktop environment2, then you ll need to rebuild the pipewire package to include the AAC codec. While the current version in Debian main has been built with AAC deliberately disabled, it is trivial to enable if you can install a version of the fdk-aac library. I preface this with the usual caveats when it comes to patent and licensing controversies. I am not a lawyer, building this package and/or using it could get you into legal trouble. These instructions have only been tested on an up-to-date copy of Debian 12.
  1. Install pipewire s build dependencies
    sudo apt install build-essential devscripts
    sudo apt build-dep pipewire
    
  2. Install libfdk-aac-dev
    sudo apt install libfdk-aac-dev
    
    If the above doesn t work you ll likely need to enable non-free and try again
    sudo sed -i 's/main/main non-free/g' /etc/apt/sources.list
    sudo apt update
    
    Alternatively, if you wish to ensure you are maximally license-compliant and patent un-infringing3, you can instead build fdk-aac-free which includes only those components of AAC that are known to be patent-free3. This is what should eventually end up in Debian to resolve this problem (see below).
    sudo apt install git-buildpackage
    mkdir fdk-aac-source
    cd fdk-aac-source
    git clone https://salsa.debian.org/multimedia-team/fdk-aac
    cd fdk-aac
    gbp buildpackage
    sudo dpkg -i ../libfdk-aac2_*deb ../libfdk-aac-dev_*deb
    
  3. Get the pipewire source code
    mkdir pipewire-source
    cd pipewire-source
    apt source pipewire
    
    This will create a bunch of files within the pipewire-source directory, but you ll only need the pipewire-<version> folder, this contains all the files you ll need to build the package, with all the debian-specific patches already applied. Note that you don t want to run the apt source command as root, as it will then create files that your regular user cannot edit.
  4. Fix the dependencies and build options To fix up the build scripts to use the fdk-aac library, you need to save the following as pipewire-source/aac.patch
    --- debian/control.orig
    +++ debian/control
    @@ -40,8 +40,8 @@
                 modemmanager-dev,
                 pkg-config,
                 python3-docutils,
    -               systemd [linux-any]
    -Build-Conflicts: libfdk-aac-dev
    +               systemd [linux-any],
    +               libfdk-aac-dev
     Standards-Version: 4.6.2
     Vcs-Browser: https://salsa.debian.org/utopia-team/pipewire
     Vcs-Git: https://salsa.debian.org/utopia-team/pipewire.git
    --- debian/rules.orig
    +++ debian/rules
    @@ -37,7 +37,7 @@
     		-Dauto_features=enabled \
     		-Davahi=enabled \
     		-Dbluez5-backend-native-mm=enabled \
    -		-Dbluez5-codec-aac=disabled \
    +		-Dbluez5-codec-aac=enabled \
     		-Dbluez5-codec-aptx=enabled \
     		-Dbluez5-codec-lc3=enabled \
     		-Dbluez5-codec-lc3plus=disabled \
    
    Then you ll need to run patch from within the pipewire-<version> folder created by apt source:
    patch -p0 < ../aac.patch
    
  5. Build pipewire
    cd pipewire-*
    debuild
    
    Note that you will likely see an error from debsign at the end of this process, this is harmless, you simply don t have a GPG key set up to sign your newly-built package4. Packages don t need to be signed to be installed, and debsign uses a somewhat non-standard signing process that dpkg does not check anyway.
  1. Install libspa-0.2-bluetooth
    sudo dpkg -i libspa-0.2-bluetooth_*.deb
    
  2. Restart PipeWire and/or Reboot
    sudo reboot
    
    Theoretically there s a set of services to restart here that would get pipewire to pick up the new library, probably just pipewire itself. But it s just as easy to restart and ensure everything is using the correct library.

Why This is a slightly unusual situation, as the fdk-aac library is licensed under what even the GNU project acknowledges is a free software license. However, this license explicitly informs the user that they need to acquire a patent license to use this software5:
3. NO PATENT LICENSE NO EXPRESS OR IMPLIED LICENSES TO ANY PATENT CLAIMS, including without limitation the patents of Fraunhofer, ARE GRANTED BY THIS SOFTWARE LICENSE. Fraunhofer provides no warranty of patent non-infringement with respect to this software. You may use this FDK AAC Codec software or modifications thereto only for purposes that are authorized by appropriate patent licenses.
To quote the GNU project:
Because of this, and because the license author is a known patent aggressor, we encourage you to be careful about using or redistributing software under this license: you should first consider whether the licensor might aim to lure you into patent infringement.
AAC is covered by a number of patents, which expire at some point in the 2030s6. As such the current version of the library is potentially legally dubious to ship with any other software, as it could be considered patent-infringing3.

Fedora s solution Since 2017, Fedora has included a modified version of the library as fdk-aac-free, see the announcement and the bugzilla bug requesting review. This version of the library includes only the AAC LC profile, which is believed to be entirely patent-free3. Based on this, there is an open bug report in Debian requesting that the fdk-aac package be moved to the main component and that the pipwire package be updated to build against it.

The Debian NEW queue To resolve these bugs, a version of fdk-aac-free has been uploaded to Debian by Jeremy Bicha. However, to make it into Debian proper, it must first pass through the ftpmaster s NEW queue. The current version of fdk-aac-free has been in the NEW queue since July 2023. Based on conversations in some of the bugs above, it s been there since at least 20227. I hope this helps anyone stuck with AAC to get their hardware working for them while we wait for the package to eventually make it through the NEW queue. Discuss on Hacker News
  1. Such as, for example, any Apple AirPods, which only support AAC AFAICT.
  2. Which, as of Debian 12 is GNOME 3 under Wayland with PipeWire.
  3. I m not a lawyer, I don t know what kinds of infringement might or might not be possible here, do your own research, etc. 2 3 4
  4. And if you DO have a key setup with debsign you almost certainly don t need these instructions.
  5. This was originally phrased as explicitly does not grant any patent rights. It was pointed out on Hacker News that this is not exactly what it says, as it also includes a specific note that you ll need to acquire your own patent license. I ve now quoted the relevant section of the license for clarity.
  6. Wikipedia claims the base patents expire in 2031, with the extensions expiring in 2038, but its source for these claims is some guy s spreadsheet in a forum. The same discussion also brings up Wikipedia s claim and casts some doubt on it, so I m not entirely sure what s correct here, but I didn t feel like doing a patent deep-dive today. If someone can provide a clear answer that would be much appreciated.
  7. According to Jeremy B cha: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1021370#17

24 February 2024

Niels Thykier: Language Server for Debian: Spellchecking

This is my third update on writing a language server for Debian packaging files, which aims at providing a better developer experience for Debian packagers. Lets go over what have done since the last report.
Semantic token support I have added support for what the Language Server Protocol (LSP) call semantic tokens. These are used to provide the editor insights into tokens of interest for users. Allegedly, this is what editors would use for syntax highlighting as well. Unfortunately, eglot (emacs) does not support semantic tokens, so I was not able to test this. There is a 3-year old PR for supporting with the last update being ~3 month basically saying "Please sign the Copyright Assignment". I pinged the GitHub issue in the hopes it will get unstuck. For good measure, I also checked if I could try it via neovim. Before installing, I read the neovim docs, which helpfully listed the features supported. Sadly, I did not spot semantic tokens among those and parked from there. That was a bit of a bummer, but I left the feature in for now. If you have an LSP capable editor that supports semantic tokens, let me know how it works for you! :)
Spellchecking Finally, I implemented something Otto was missing! :) This stared with Paul Wise reminding me that there were Python binding for the hunspell spellchecker. This enabled me to get started with a quick prototype that spellchecked the Description fields in debian/control. I also added spellchecking of comments while I was add it. The spellchecker runs with the standard en_US dictionary from hunspell-en-us, which does not have a lot of technical terms in it. Much less any of the Debian specific slang. I spend considerable time providing a "built-in" wordlist for technical and Debian specific slang to overcome this. I also made a "wordlist" for known Debian people that the spellchecker did not recognise. Said wordlist is fairly short as a proof of concept, and I fully expect it to be community maintained if the language server becomes a success. My second problem was performance. As I had suspected that spellchecking was not the fastest thing in the world. Therefore, I added a very small language server for the debian/changelog, which only supports spellchecking the textual part. Even for a small changelog of a 1000 lines, the spellchecking takes about 5 seconds, which confirmed my suspicion. With every change you do, the existing diagnostics hangs around for 5 seconds before being updated. Notably, in emacs, it seems that diagnostics gets translated into an absolute character offset, so all diagnostics after the change gets misplaced for every character you type. Now, there is little I could do to speed up hunspell. But I can, as always, cheat. The way diagnostics work in the LSP is that the server listens to a set of notifications like "document opened" or "document changed". In a response to that, the LSP can start its diagnostics scanning of the document and eventually publish all the diagnostics to the editor. The spec is quite clear that the server owns the diagnostics and the diagnostics are sent as a "notification" (that is, fire-and-forgot). Accordingly, there is nothing that prevents the server from publishing diagnostics multiple times for a single trigger. The only requirement is that the server publishes the accumulated diagnostics in every publish (that is, no delta updating). Leveraging this, I had the language server for debian/changelog scan the document and publish once for approximately every 25 typos (diagnostics) spotted. This means you quickly get your first result and that clears the obsolete diagnostics. Thereafter, you get frequent updates to the remainder of the document if you do not perform any further changes. That is, up to a predefined max of typos, so we do not overload the client for longer changelogs. If you do any changes, it resets and starts over. The only bit missing was dealing with concurrency. By default, a pygls language server is single threaded. It is not great if the language server hangs for 5 seconds everytime you type anything. Fortunately, pygls has builtin support for asyncio and threaded handlers. For now, I did an async handler that await after each line and setup some manual detection to stop an obsolete diagnostics run. This means the server will fairly quickly abandon an obsolete run. Also, as a side-effect of working on the spellchecking, I fixed multiple typos in the changelog of debputy. :)
Follow up on the "What next?" from my previous update In my previous update, I mentioned I had to finish up my python-debian changes to support getting the location of a token in a deb822 file. That was done, the MR is now filed, and is pending review. Hopefully, it will be merged and uploaded soon. :) I also submitted my proposal for a different way of handling relationship substvars to debian-devel. So far, it seems to have received only positive feedback. I hope it stays that way and we will have this feature soon. Guillem proposed to move some of this into dpkg, which might delay my plans a bit. However, it might be for the better in the long run, so I will wait a bit to see what happens on that front. :) As noted above, I managed to add debian/changelog as a support format for the language server. Even if it only does spellchecking and trimming of trailing newlines on save, it technically is a new format and therefore cross that item off my list. :D Unfortunately, I did not manage to write a linter variant that does not involve using an LSP-capable editor. So that is still pending. Instead, I submitted an MR against elpa-dpkg-dev-el to have it recognize all the fields that the debian/control LSP knows about at this time to offset the lack of semantic token support in eglot.
From here... My sprinting on this topic will soon come to an end, so I have to a bit more careful now with what tasks I open! I think I will narrow my focus to providing a batch linting interface. Ideally, with an auto-fix for some of the more mechanical issues, where this is little doubt about the answer. Additionally, I think the spellchecking will need a bit more maturing. My current code still trips on naming patterns that are "clearly" verbatim or code references like things written in CamelCase or SCREAMING_SNAKE_CASE. That gets annoying really quickly. It also trips on a lot of commands like dpkg-gencontrol, but that is harder to fix since it could have been a real word. I think those will have to be fixed people using quotes around the commands. Maybe the most popular ones will end up in the wordlist. Beyond that, I will play it by ear if I have any time left. :)

21 February 2024

Niels Thykier: Expanding on the Language Server (LSP) support for debian/control

I have spent some more time on improving my language server for debian/control. Today, I managed to provide the following features:
  • The X- style prefixes for field names are now understood and handled. This means the language server now considers XC-Package-Type the same as Package-Type.

  • More diagnostics:

    • Fields without values now trigger an error marker
    • Duplicated fields now trigger an error marker
    • Fields used in the wrong paragraph now trigger an error marker
    • Typos in field names or values now trigger a warning marker. For field names, X- style prefixes are stripped before typo detection is done.
    • The value of the Section field is now validated against a dataset of known sections and trigger a warning marker if not known.
  • The "on-save trim end of line whitespace" now works. I had a logic bug in the server side code that made it submit "no change" edits to the editor.

  • The language server now provides "hover" documentation for field names. There is a small screenshot of this below. Sadly, emacs does not support markdown or, if it does, it does not announce the support for markdown. For now, all the documentation is always in markdown format and the language server will tag it as either markdown or plaintext depending on the announced support.

  • The language server now provides quick fixes for some of the more trivial problems such as deprecated fields or typos of fields and values.

  • Added more known fields including the XS-Autobuild field for non-free packages along with a link to the relevant devref section in its hover doc.

This covers basically all my known omissions from last update except spellchecking of the Description field. An image of emacs showing documentation for the Provides field from the language server.
Spellchecking Personally, I feel spellchecking would be a very welcome addition to the current feature set. However, reviewing my options, it seems that most of the spellchecking python libraries out there are not packaged for Debian, or at least not other the name I assumed they would be. The alternative is to pipe the spellchecking to another program like aspell list. I did not test this fully, but aspell list does seem to do some input buffering that I cannot easily default (at least not in the shell). Though, either way, the logic for this will not be trivial and aspell list does not seem to include the corrections either. So best case, you would get typo markers but no suggestions for what you should have typed. Not ideal. Additionally, I am also concerned with the performance for this feature. For d/control, it will be a trivial matter in practice. However, I would be reusing this for d/changelog which is 99% free text with plenty of room for typos. For a regular linter, some slowness is acceptable as it is basically a batch tool. However, for a language server, this potentially translates into latency for your edits and that gets annoying. While it is definitely on my long term todo list, I am a bit afraid that it can easily become a time sink. Admittedly, this does annoy me, because I wanted to cross off at least one of Otto's requested features soon.
On wrap-and-sort support The other obvious request from Otto would be to automate wrap-and-sort formatting. Here, the problem is that "we" in Debian do not agree on the one true formatting of debian/control. In fact, I am fairly certain we do not even agree on whether we should all use wrap-and-sort. This implies we need a style configuration. However, if we have a style configuration per person, then you get style "ping-pong" for packages where the co-maintainers do not all have the same style configuration. Additionally, it is very likely that you are a member of multiple packaging teams or groups that all have their own unique style. Ergo, only having a personal config file is doomed to fail. The only "sane" option here that I can think of is to have or support "per package" style configuration. Something that would be committed to git, so the tooling would automatically pick up the configuration. Obviously, that is not fun for large packaging teams where you have to maintain one file per package if you want a consistent style across all packages. But it beats "style ping-pong" any day of the week. Note that I am perfectly open to having a personal configuration file as a fallback for when the "per package" configuration file is absent. The second problem is the question of which format to use and what to name this file. Since file formats and naming has never been controversial at all, this will obviously be the easy part of this problem. But the file should be parsable by both wrap-and-sort and the language server, so you get the same result regardless of which tool you use. If we do not ensure this, then we still have the style ping-pong problem as people use different tools. This also seems like time sink with no end. So, what next then...?
What next? On the language server front, I will have a look at its support for providing semantic hints to the editors that might be used for syntax highlighting. While I think most common Debian editors have built syntax highlighting already, I would like this language server to stand on its own. I would like us to be in a situation where we do not have implement yet another editor extension for Debian packaging files. At least not for editors that support the LSP spec. On a different front, I have an idea for how we go about relationship related substvars. It is not directly related to this language server, except I got triggered by the language server "missing" a diagnostic for reminding people to add the magic Depends: $ misc:Depends [, $ shlibs:Depends ] boilerplate. The magic boilerplate that you have to write even though we really should just fix this at a tooling level instead. Energy permitting, I will formulate a proposal for that and send it to debian-devel. Beyond that, I think I might start adding support for another file. I also need to wrap up my python-debian branch, so I can get the position support into the Debian soon, which would remove one papercut for using this language server. Finally, it might be interesting to see if I can extract a "batch-linter" version of the diagnostics and related quickfix features. If nothing else, the "linter" variant would enable many of you to get a "mini-Lintian" without having to do a package build first.

20 February 2024

Niels Thykier: Language Server (LSP) support for debian/control

About a month ago, Otto Kek l inen asked for editor extensions for debian related files on the debian-devel mailing list. In that thread, I concluded that what we were missing was a "Language Server" (LSP) for our packaging files. Last week, I started a prototype for such a LSP for the debian/control file as a starting point based on the pygls library. The initial prototype worked and I could do very basic diagnostics plus completion suggestion for field names.
Current features I got 4 basic features implemented, though I have only been able to test two of them in emacs.
  • Diagnostics or linting of basic issues.
  • Completion suggestions for all known field names that I could think of and values for some fields.
  • Folding ranges (untested). This feature enables the editor to "fold" multiple lines. It is often used with multi-line comments and that is the feature currently supported.
  • On save, trim trailing whitespace at the end of lines (untested). Might not be registered correctly on the server end.
Despite its very limited feature set, I feel editing debian/control in emacs is now a much more pleasant experience. Coming back to the features that Otto requested, the above covers a grand total of zero. Sorry, Otto. It is not you, it is me.
Completion suggestions For completion, all known fields are completed. Place the cursor at the start of the line or in a partially written out field name and trigger the completion in your editor. In my case, I can type R-R-R and trigger the completion and the editor will automatically replace it with Rules-Requires-Root as the only applicable match. Your milage may vary since I delegate most of the filtering to the editor, meaning the editor has the final say about whether your input matches anything. The only filtering done on the server side is that the server prunes out fields already used in the paragraph, so you are not presented with the option to repeat an already used field, which would be an error. Admittedly, not an error the language server detects at the moment, but other tools will. When completing field, if the field only has one non-default value such as Essential which can be either no (the default, but you should not use it) or yes, then the completion suggestion will complete the field along with its value. This is mostly only applicable for "yes/no" fields such as Essential and Protected. But it does also trigger for Package-Type at the moment. As for completing values, here the language server can complete the value for simple fields such as "yes/no" fields, Multi-Arch, Package-Type and Priority. I intend to add support for Section as well - maybe also Architecture.
Diagnostics On the diagnostic front, I have added multiple diagnostics:
  • An error marker for syntax errors.
  • An error marker for missing a mandatory field like Package or Architecture. This also includes Standards-Version, which is admittedly mandatory by policy rather than tooling falling part.
  • An error marker for adding Multi-Arch: same to an Architecture: all package.
  • Error marker for providing an unknown value to a field with a set of known values. As an example, writing foo in Multi-Arch would trigger this one.
  • Warning marker for using deprecated fields such as DM-Upload-Allowed, or when setting a field to its default value for fields like Essential. The latter rule only applies to selected fields and notably Multi-Arch: no does not trigger a warning.
  • Info level marker if a field like Priority duplicates the value of the Source paragraph.
Notable omission at this time:
  • No errors are raised if a field does not have a value.
  • No errors are raised if a field is duplicated inside a paragraph.
  • No errors are used if a field is used in the wrong paragraph.
  • No spellchecking of the Description field.
  • No understanding that Foo and X[CBS]-Foo are related. As an example, XC-Package-Type is completely ignored despite being the old name for Package-Type.
  • Quick fixes to solve these problems... :)
Trying it out If you want to try, it is sadly a bit more involved due to things not being uploaded or merged yet. Also, be advised that I will regularly rebase my git branches as I revise the code. The setup:
  • Build and install the deb of the main branch of pygls from https://salsa.debian.org/debian/pygls The package is in NEW and hopefully this step will soon just be a regular apt install.
  • Build and install the deb of the rts-locatable branch of my python-debian fork from https://salsa.debian.org/nthykier/python-debian There is a draft MR of it as well on the main repo.
  • Build and install the deb of the lsp-support branch of debputy from https://salsa.debian.org/debian/debputy
  • Configure your editor to run debputy lsp debian/control as the language server for debian/control. This is depends on your editor. I figured out how to do it for emacs (see below). I also found a guide for neovim at https://neovim.io/doc/user/lsp. Note that debputy can be run from any directory here. The debian/control is a reference to the file format and not a concrete file in this case.
Obviously, the setup should get easier over time. The first three bullet points should eventually get resolved by merges and upload meaning you end up with an apt install command instead of them. For the editor part, I would obviously love it if we can add snippets for editors to make the automatically pick up the language server when the relevant file is installed.
Using the debputy LSP in emacs The guide I found so far relies on eglot. The guide below assumes you have the elpa-dpkg-dev-el package installed for the debian-control-mode. Though it should be a trivially matter to replace debian-control-mode with a different mode if you use a different mode for your debian/control file. In your emacs init file (such as ~/.emacs or ~/.emacs.d/init.el), you add the follow blob.
(with-eval-after-load 'eglot
    (add-to-list 'eglot-server-programs
        '(debian-control-mode . ("debputy" "lsp" "debian/control"))))
Once you open the debian/control file in emacs, you can type M-x eglot to activate the language server. Not sure why that manual step is needed and if someone knows how to automate it such that eglot activates automatically on opening debian/control, please let me know. For testing completions, I often have to manually activate them (with C-M-i or M-x complete-symbol). Though, it is a bit unclear to me whether this is an emacs setting that I have not toggled or something I need to do on the language server side.
From here As next steps, I will probably look into fixing some of the "known missing" items under diagnostics. The quick fix would be a considerable improvement to assisting users. In the not so distant future, I will probably start to look at supporting other files such as debian/changelog or look into supporting configuration, so I can cover formatting features like wrap-and-sort. I am also very much open to how we can provide integrations for this feature into editors by default. I will probably create a separate binary package for specifically this feature that pulls all relevant dependencies that would be able to provide editor integrations as well.

Niels Thykier: Language Server (LSP) support for debian/control

Work done:
  • [X] No errors are raised if a field does not have a value.
  • [X] No errors are raised if a field is duplicated inside a paragraph.
  • [X] No errors are used if a field is used in the wrong paragraph.
  • [ ] No spellchecking of the Description field.
  • [X] No understanding that Foo and X[CBS]-Foo are related. As an example, XC-Package-Type is completely ignored despite being the old name for Package-Type.
  • [X] Fixed the on-save trim end of line whitespace. Bug in the server end.
  • [X] Hover text for field names

13 February 2024

Matthew Palmer: Not all TLDs are Created Equal

In light of the recent cancellation of the queer.af domain registration by the Taliban, the fragile and difficult nature of country-code top-level domains (ccTLDs) has once again been comprehensively demonstrated. Since many people may not be aware of the risks, I thought I d give a solid explainer of the whole situation, and explain why you should, in general, not have anything to do with domains which are registered under ccTLDs.

Top-level What-Now? A top-level domain (TLD) is the last part of a domain name (the collection of words, separated by periods, after the https:// in your web browser s location bar). It s the com in example.com, or the af in queer.af. There are two kinds of TLDs: country-code TLDs (ccTLDs) and generic TLDs (gTLDs). Despite all being TLDs, they re very different beasts under the hood.

What s the Difference? Generic TLDs are what most organisations and individuals register their domains under: old-school technobabble like com , net , or org , historical oddities like gov , and the new-fangled world of words like tech , social , and bank . These gTLDs are all regulated under a set of rules created and administered by ICANN (the Internet Corporation for Assigned Names and Numbers ), which try to ensure that things aren t a complete wild-west, limiting things like price hikes (well, sometimes, anyway), and providing means for disputes over names1. Country-code TLDs, in contrast, are all two letters long2, and are given out to countries to do with as they please. While ICANN kinda-sorta has something to do with ccTLDs (in the sense that it makes them exist on the Internet), it has no authority to control how a ccTLD is managed. If a country decides to raise prices by 100x, or cancel all registrations that were made on the 12th of the month, there s nothing anyone can do about it. If that sounds bad, that s because it is. Also, it s not a theoretical problem the Taliban deciding to asssert its bigotry over the little corner of the Internet namespace it has taken control of is far from the first time that ccTLDs have caused grief.

Shifting Sands The queer.af cancellation is interesting because, at the time the domain was reportedly registered, 2018, Afghanistan had what one might describe as, at least, a different political climate. Since then, of course, things have changed, and the new bosses have decided to get a bit more active. Those running queer.af seem to have seen the writing on the wall, and were planning on moving to another, less fraught, domain, but hadn t completed that move when the Taliban came knocking.

The Curious Case of Brexit When the United Kingdom decided to leave the European Union, it fell foul of the EU s rules for the registration of domains under the eu ccTLD3. To register (and maintain) a domain name ending in .eu, you have to be a resident of the EU. When the UK ceased to be part of the EU, residents of the UK were no longer EU residents. Cue much unhappiness, wailing, and gnashing of teeth when this was pointed out to Britons. Some decided to give up their domains, and move to other parts of the Internet, while others managed to hold onto them by various legal sleight-of-hand (like having an EU company maintain the registration on their behalf). In any event, all very unpleasant for everyone involved.

Geopolitics on the Internet?!? After Russia invaded Ukraine in February 2022, the Ukranian Vice Prime Minister asked ICANN to suspend ccTLDs associated with Russia. While ICANN said that it wasn t going to do that, because it wouldn t do anything useful, some domain registrars (the companies you pay to register domain names) ceased to deal in Russian ccTLDs, and some websites restricted links to domains with Russian ccTLDs. Whether or not you agree with the sort of activism implied by these actions, the fact remains that even the actions of a government that aren t directly related to the Internet can have grave consequences for your domain name if it s registered under a ccTLD. I don t think any gTLD operator will be invading a neighbouring country any time soon.

Money, Money, Money, Must Be Funny When you register a domain name, you pay a registration fee to a registrar, who does administrative gubbins and causes you to be able to control the domain name in the DNS. However, you don t own that domain name4 you re only renting it. When the registration period comes to an end, you have to renew the domain name, or you ll cease to be able to control it. Given that a domain name is typically your brand or identity online, the chances are you d prefer to keep it over time, because moving to a new domain name is a massive pain, having to tell all your customers or users that now you re somewhere else, plus having to accept the risk of someone registering the domain name you used to have and capturing your traffic it s all a gigantic hassle. For gTLDs, ICANN has various rules around price increases and bait-and-switch pricing that tries to keep a lid on the worst excesses of registries. While there are any number of reasonable criticisms of the rules, and the Internet community has to stay on their toes to keep ICANN from totally succumbing to regulatory capture, at least in the gTLD space there s some degree of control over price gouging. On the other hand, ccTLDs have no effective controls over their pricing. For example, in 2008 the Seychelles increased the price of .sc domain names from US$25 to US$75. No reason, no warning, just pay up .

Who Is Even Getting That Money? A closely related concern about ccTLDs is that some of the cool ones are assigned to countries that are not great. The poster child for this is almost certainly Libya, which has the ccTLD ly . While Libya was being run by a terrorist-supporting extremist, companies thought it was a great idea to have domain names that ended in .ly. These domain registrations weren t (and aren t) cheap, and it s hard to imagine that at least some of that money wasn t going to benefit the Gaddafi regime. Similarly, the British Indian Ocean Territory, which has the io ccTLD, was created in a colonialist piece of chicanery that expelled thousands of native Chagossians from Diego Garcia. Money from the registration of .io domains doesn t go to the (former) residents of the Chagos islands, instead it gets paid to the UK government. Again, I m not trying to suggest that all gTLD operators are wonderful people, but it s not particularly likely that the direct beneficiaries of the operation of a gTLD stole an island chain and evicted the residents.

Are ccTLDs Ever Useful? The answer to that question is an unqualified maybe . I certainly don t think it s a good idea to register a domain under a ccTLD for vanity purposes: because it makes a word, is the same as a file extension you like, or because it looks cool. Those ccTLDs that clearly represent and are associated with a particular country are more likely to be OK, because there is less impetus for the registry to try a naked cash grab. Unfortunately, ccTLD registries have a disconcerting habit of changing their minds on whether they serve their geographic locality, such as when auDA decided to declare an open season in the .au namespace some years ago. Essentially, while a ccTLD may have geographic connotations now, there s not a lot of guarantee that they won t fall victim to scope creep in the future. Finally, it might be somewhat safer to register under a ccTLD if you live in the location involved. At least then you might have a better idea of whether your domain is likely to get pulled out from underneath you. Unfortunately, as the .eu example shows, living somewhere today is no guarantee you ll still be living there tomorrow, even if you don t move house. In short, I d suggest sticking to gTLDs. They re at least lower risk than ccTLDs.

+1, Helpful If you ve found this post informative, why not buy me a refreshing beverage? My typing fingers (both of them) thank you in advance for your generosity.

Footnotes
  1. don t make the mistake of thinking that I approve of ICANN or how it operates; it s an omnishambles of poor governance and incomprehensible decision-making.
  2. corresponding roughly, though not precisely (because everything has to be complicated, because humans are complicated), to the entries in the ISO standard for Codes for the representation of names of countries and their subdivisions , ISO 3166.
  3. yes, the EU is not a country; it s part of the roughly, though not precisely caveat mentioned previously.
  4. despite what domain registrars try very hard to imply, without falling foul of deceptive advertising regulations.

12 February 2024

Gunnar Wolf: Heads up! A miniDebConf is approaching in Santa Fe, Argentina

I realize it s a bit late to start publicly organizing this, but better late than never I m happy some Debian people I have directly contacted have already expressed interest. So, lets make this public! For all interested people who are reasonably close to central Argentina, or can be persuaded to come here in a month s time You are all welcome! It seems I managed to convince my good friend Mart n Bayo (some Debian people will remember him, as he was present in DebConf19 in Curitiba, Brazil) to get some facilities for us to have a nice Debian get-together in Central Argentina.

Where? We will meet at APUL Asociaci n de Personal no-docente de la Universidad Nacional del Litoral, in downtown Santa Fe, Argentina.

When? Saturday, 2024.03.09. It is quite likely we can get some spaces for continuing over Sunday if there is demand.

What are we planning? We have little time for planning but we want to have a space for Debian-related outreach (so, please think about a topic or two you d like to share with general free software-interested, not too technical, audience). Please tell me by mail (gwolf@debian.org) about any ideas you might have. We also want to have a general hacklab-style area to hang out, work a bit in our projects, and spend a good time together.

Logistics I have briefly commented about this with our dear and always mighty DPL, and Debian will support Debian-related people interested in attending; please check personally with me for specifics on how to handle this case by case. My intention is to cover costs for travel, accomodation (one or two nights) and food for whoever is interested in coming over.

More information As I don t want to direct people to keep an eye on my blog post for updates, I ll copy this information (and keep it updated!) at the Debian Wiki / DebianEvents / ar / 2024 / MiniDebConf / Santa Fe please refer to that page!

Contact

Codes of Conduct DebConf and Debian Code of Conduct apply. See the DebConf Code of Conduct and the Debian Code of Conduct.

Registration Registration is free, but needed. See the separate Registration page.

Talks Please, send your proposal to gwolf@debian.org

8 February 2024

Dirk Eddelbuettel: RcppArmadillo 0.12.8.0.0 on CRAN: New Upstream, Interface Polish

armadillo image Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 1119 other packages on CRAN, downloaded 32.5 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 575 times according to Google Scholar. This release brings a new (stable) upstream (minor) release Armadillo 12.8.0 prepared by Conrad two days ago. We, as usual, prepared a release candidate which we tested against the over 1100 CRAN packages using RcppArmadillo. This found no issues, which was confirmed by CRAN once we uploaded and so it arrived as a new release today in a fully automated fashion. We also made a small change that had been prepared by GitHub issue #400: a few internal header files that were cluttering the top-level of the include directory have been moved to internal directories. The standard header is of course unaffected, and the set of full / light / lighter / lightest headers (matching we did a while back in Rcpp) also continue to work as one expects. This change was also tested in a full reverse-dependency check in January but had not been released to CRAN yet. The set of changes since the last CRAN release follows.

Changes in RcppArmadillo version 0.12.8.0.0 (2024-02-06)
  • Upgraded to Armadillo release 12.8.0 (Cortisol Injector)
    • Faster detection of symmetric expressions by pinv() and rank()
    • Expanded shift() to handle sparse matrices
    • Expanded conv_to for more flexible conversions between sparse and dense matrices
    • Added cbrt()
    • More compact representation of integers when saving matrices in CSV format
  • Five non-user facing top-level include files have been removed (#432 closing #400 and building on #395 and #396)

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

7 February 2024

Reproducible Builds: Reproducible Builds in January 2024

Welcome to the January 2024 report from the Reproducible Builds project. In these reports we outline the most important things that we have been up to over the past month. If you are interested in contributing to the project, please visit our Contribute page on our website.

How we executed a critical supply chain attack on PyTorch John Stawinski and Adnan Khan published a lengthy blog post detailing how they executed a supply-chain attack against PyTorch, a popular machine learning platform used by titans like Google, Meta, Boeing, and Lockheed Martin :
Our exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to [Amazon Web Services], potentially add code to the main repository branch, backdoor PyTorch dependencies the list goes on. In short, it was bad. Quite bad.
The attack pivoted on PyTorch s use of self-hosted runners as well as submitting a pull request to address a trivial typo in the project s README file to gain access to repository secrets and API keys that could subsequently be used for malicious purposes.

New Arch Linux forensic filesystem tool On our mailing list this month, long-time Reproducible Builds developer kpcyrd announced a new tool designed to forensically analyse Arch Linux filesystem images. Called archlinux-userland-fs-cmp, the tool is supposed to be used from a rescue image (any Linux) with an Arch install mounted to, [for example], /mnt. Crucially, however, at no point is any file from the mounted filesystem eval d or otherwise executed. Parsers are written in a memory safe language. More information about the tool can be found on their announcement message, as well as on the tool s homepage. A GIF of the tool in action is also available.

Issues with our SOURCE_DATE_EPOCH code? Chris Lamb started a thread on our mailing list summarising some potential problems with the source code snippet the Reproducible Builds project has been using to parse the SOURCE_DATE_EPOCH environment variable:
I m not 100% sure who originally wrote this code, but it was probably sometime in the ~2015 era, and it must be in a huge number of codebases by now. Anyway, Alejandro Colomar was working on the shadow security tool and pinged me regarding some potential issues with the code. You can see this conversation here.
Chris ends his message with a request that those with intimate or low-level knowledge of time_t, C types, overflows and the various parsing libraries in the C standard library (etc.) contribute with further info.

Distribution updates In Debian this month, Roland Clobus posted another detailed update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours) . Additionally 7 of the 8 bookworm images from the official download link build reproducibly at any later time. In addition to this, three reviews of Debian packages were added, 17 were updated and 15 were removed this month adding to our knowledge about identified issues. Elsewhere, Bernhard posted another monthly update for his work elsewhere in openSUSE.

Community updates There were made a number of improvements to our website, including Bernhard M. Wiedemann fixing a number of typos of the term nondeterministic . [ ] and Jan Zerebecki adding a substantial and highly welcome section to our page about SOURCE_DATE_EPOCH to document its interaction with distribution rebuilds. [ ].
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 254 and 255 to Debian but focusing on triaging and/or merging code from other contributors. This included adding support for comparing eXtensible ARchive (.XAR/.PKG) files courtesy of Seth Michael Larson [ ][ ], as well considerable work from Vekhir in order to fix compatibility between various and subtle incompatible versions of the progressbar libraries in Python [ ][ ][ ][ ]. Thanks!

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In January, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Reduce the number of arm64 architecture workers from 24 to 16. [ ]
    • Use diffoscope from the Debian release being tested again. [ ]
    • Improve the handling when killing unwanted processes [ ][ ][ ] and be more verbose about it, too [ ].
    • Don t mark a job as failed if process marked as to-be-killed is already gone. [ ]
    • Display the architecture of builds that have been running for more than 48 hours. [ ]
    • Reboot arm64 nodes when they hit an OOM (out of memory) state. [ ]
  • Package rescheduling changes:
    • Reduce IRC notifications to 1 when rescheduling due to package status changes. [ ]
    • Correctly set SUDO_USER when rescheduling packages. [ ]
    • Automatically reschedule packages regressing to FTBFS (build failure) or FTBR (build success, but unreproducible). [ ]
  • OpenWrt-related changes:
    • Install the python3-dev and python3-pyelftools packages as they are now needed for the sunxi target. [ ][ ]
    • Also install the libpam0g-dev which is needed by some OpenWrt hardware targets. [ ]
  • Misc:
    • As it s January, set the real_year variable to 2024 [ ] and bump various copyright years as well [ ].
    • Fix a large (!) number of spelling mistakes in various scripts. [ ][ ][ ]
    • Prevent Squid and Systemd processes from being killed by the kernel s OOM killer. [ ]
    • Install the iptables tool everywhere, else our custom rc.local script fails. [ ]
    • Cleanup the /srv/workspace/pbuilder directory on boot. [ ]
    • Automatically restart Squid if it fails. [ ]
    • Limit the execution of chroot-installation jobs to a maximum of 4 concurrent runs. [ ][ ]
Significant amounts of node maintenance was performed by Holger Levsen (eg. [ ][ ][ ][ ][ ][ ][ ] etc.) and Vagrant Cascadian (eg. [ ][ ][ ][ ][ ][ ][ ][ ]). Indeed, Vagrant Cascadian handled an extended power outage for the network running the Debian armhf architecture test infrastructure. This provided the incentive to replace the UPS batteries and consolidate infrastructure to reduce future UPS load. [ ] Elsewhere in our infrastructure, however, Holger Levsen also adjusted the email configuration for @reproducible-builds.org to deal with a new SMTP email attack. [ ]

Upstream patches The Reproducible Builds project tries to detects, dissects and fix as many (currently) unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Separate to this, Vagrant Cascadian followed up with the relevant maintainers when reproducibility fixes were not included in newly-uploaded versions of the mm-common package in Debian this was quickly fixed, however. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

2 February 2024

Ben Hutchings: FOSS activity in January 2024

Scarlett Gately Moore: Some exciting news! Kubuntu: I m back!!!

It s official, the Kubuntu Council has hired me part time to work on the 24.04 LTS release, preparation for Plasma 6, and to bring life back into the Distribution. First I want thank the Kubuntu Council for this opportunity and I plan a long and successful journey together!!!! My first week ( I started midweek ): It has been a busy one! Many meet and greets with the team and other interested parties. I had the chance to chat with Mike from Kubuntu Focus and I have to say I am absolutely amazed with the work they have done, and if you are in the market for a new laptop, you must check these out!!! https://kfocus.org Or if you want to try before you buy you can download the OS! All they ask is for an e-mail, which is completely reasonable. Hosting isn t free! Besides, you can opt out anytime and they don t share it with anyone. I look forward to working closely with this project. We now have a Kubuntu Team in KDE invent https://invent.kde.org/teams/distribution-kubuntu if you would like to join us, please don t hesitate to ask! I have started a new Wiki and our first page is the ever important Bug triaging! It is still a WIP but you can check it out here: https://invent.kde.org/teams/distribution-kubuntu/docs/-/wikis/Bug-Triage-Story-WIP , with that said I have started the launchpad work to make tracking our bugs easier buy subscribing kubuntu-bugs to all our packages and creating proper projects for our packages missing them. We have compiled a list of our various documentation links that need updated and Rick Timmis is updating kubuntu.org! Aaron Honeycutt has been busy with the Kubuntu Manual https://github.com/kubuntu-team/kubuntu-manual which is in good shape. We just need to improve our developer story  I have been working on the rather massive Apparmor bug https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/2046844 with testing the fixes from the ppa and writing profiles for the various KDE packages affected ( pretty much anything that uses webengine ) and making progress there. My next order of business staging Frameworks 5.114 with guidance from our super awesome Rik Mills that has been doing most of the heavy lifting in Kubuntu for many years now. So thank you for that Rik  I will also start on our big transition to the Calamaras Installer! I do have experience here, so I expect it will be a smooth one. I am so excited for the future of Kubuntu and the exciting things to come! With that said, the Kubuntu funding is community donation driven. There is enough to pay me part time for a couple contracts, but it will run out and a full-time contract would be super awesome. I am reaching out to anyone enjoying Kubuntu and want to help with the future of Kubuntu to please consider a donation! We are working on more donation options, but for now you can donate through paypal at https://kubuntu.org/donate/ Thank you!!!!!

30 January 2024

Antoine Beaupr : router archeology: the Soekris net5001

Roadkiller was a Soekris net5501 router I used as my main gateway between 2010 and 2016 (for r seau and t l phone). It was upgraded to FreeBSD 8.4-p12 (2014-06-06) and pkgng. It was retired in favor of octavia around 2016. Roughly 10 years later (2024-01-24), I found it in a drawer and, to my surprised, it booted. After wrangling with a RS-232 USB adapter, a null modem cable, and bit rates, I even logged in:
comBIOS ver. 1.33  20070103  Copyright (C) 2000-2007 Soekris Engineering.
net5501
0512 Mbyte Memory                        CPU Geode LX 500 Mhz 
Pri Mas  WDC WD800VE-00HDT0              LBA Xlt 1024-255-63  78 Gbyte
Slot   Vend Dev  ClassRev Cmd  Stat CL LT HT  Base1    Base2   Int 
-------------------------------------------------------------------
0:01:2 1022 2082 10100000 0006 0220 08 00 00 A0000000 00000000 10
0:06:0 1106 3053 02000096 0117 0210 08 40 00 0000E101 A0004000 11
0:07:0 1106 3053 02000096 0117 0210 08 40 00 0000E201 A0004100 05
0:08:0 1106 3053 02000096 0117 0210 08 40 00 0000E301 A0004200 09
0:09:0 1106 3053 02000096 0117 0210 08 40 00 0000E401 A0004300 12
0:20:0 1022 2090 06010003 0009 02A0 08 40 80 00006001 00006101 
0:20:2 1022 209A 01018001 0005 02A0 08 00 00 00000000 00000000 
0:21:0 1022 2094 0C031002 0006 0230 08 00 80 A0005000 00000000 15
0:21:1 1022 2095 0C032002 0006 0230 08 00 00 A0006000 00000000 15
 4 Seconds to automatic boot.   Press Ctrl-P for entering Monitor.
 
                                            
                                                  ______
                                                    ____  __ ___  ___ 
            Welcome to FreeBSD!                     __   '__/ _ \/ _ \
                                                    __       __/  __/
                                                                      
    1. Boot FreeBSD [default]                     _     _   \___ \___ 
    2. Boot FreeBSD with ACPI enabled             ____   _____ _____
    3. Boot FreeBSD in Safe Mode                    _ \ / ____   __ \
    4. Boot FreeBSD in single user mode             _)   (___         
    5. Boot FreeBSD with verbose logging            _ < \___ \        
    6. Escape to loader prompt                      _)  ____)    __   
    7. Reboot                                                         
                                                  ____/ _____/ _____/
                                            
                                            
                                            
    Select option, [Enter] for default      
    or [Space] to pause timer  5            
  
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.4-RELEASE-p12 #5: Fri Jun  6 02:43:23 EDT 2014
    root@roadkiller.anarc.at:/usr/obj/usr/src/sys/ROADKILL i386
gcc version 4.2.2 20070831 prerelease [FreeBSD]
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Geode(TM) Integrated Processor by AMD PCS (499.90-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x5a2  Family = 5  Model = a  Stepping = 2
  Features=0x88a93d<FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CLFLUSH,MMX>
  AMD Features=0xc0400000<MMX+,3DNow!+,3DNow!>
real memory  = 536870912 (512 MB)
avail memory = 506445824 (482 MB)
kbd1 at kbdmux0
K6-family MTRR support enabled (2 registers)
ACPI Error: A valid RSDP was not found (20101013/tbxfroot-309)
ACPI: Table initialisation failed: AE_NOT_FOUND
ACPI: Try disabling either ACPI or apic support.
cryptosoft0: <software crypto> on motherboard
pcib0 pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
Geode LX: Soekris net5501 comBIOS ver. 1.33 20070103 Copyright (C) 2000-2007
pci0: <encrypt/decrypt, entertainment crypto> at device 1.2 (no driver attached)
vr0: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe100-0xe1ff mem 0xa0004000-0xa00040ff irq 11 at device 6.0 on pci0
vr0: Quirks: 0x2
vr0: Revision: 0x96
miibus0: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> PHY 1 on miibus0
ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr0: Ethernet address: 00:00:24:cc:93:44
vr0: [ITHREAD]
vr1: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe200-0xe2ff mem 0xa0004100-0xa00041ff irq 5 at device 7.0 on pci0
vr1: Quirks: 0x2
vr1: Revision: 0x96
miibus1: <MII bus> on vr1
ukphy1: <Generic IEEE 802.3u media interface> PHY 1 on miibus1
ukphy1:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr1: Ethernet address: 00:00:24:cc:93:45
vr1: [ITHREAD]
vr2: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe300-0xe3ff mem 0xa0004200-0xa00042ff irq 9 at device 8.0 on pci0
vr2: Quirks: 0x2
vr2: Revision: 0x96
miibus2: <MII bus> on vr2
ukphy2: <Generic IEEE 802.3u media interface> PHY 1 on miibus2
ukphy2:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr2: Ethernet address: 00:00:24:cc:93:46
vr2: [ITHREAD]
vr3: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe400-0xe4ff mem 0xa0004300-0xa00043ff irq 12 at device 9.0 on pci0
vr3: Quirks: 0x2
vr3: Revision: 0x96
miibus3: <MII bus> on vr3
ukphy3: <Generic IEEE 802.3u media interface> PHY 1 on miibus3
ukphy3:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr3: Ethernet address: 00:00:24:cc:93:47
vr3: [ITHREAD]
isab0: <PCI-ISA bridge> at device 20.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <AMD CS5536 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 20.2 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata0: [ITHREAD]
ata1: <ATA channel> at channel 1 on atapci0
ata1: [ITHREAD]
ohci0: <OHCI (generic) USB controller> mem 0xa0005000-0xa0005fff irq 15 at device 21.0 on pci0
ohci0: [ITHREAD]
usbus0 on ohci0
ehci0: <AMD CS5536 (Geode) USB 2.0 controller> mem 0xa0006000-0xa0006fff irq 15 at device 21.1 on pci0
ehci0: [ITHREAD]
usbus1: EHCI version 1.0
usbus1 on ehci0
cpu0 on motherboard
pmtimer0 on isa0
orm0: <ISA Option ROM> at iomem 0xc8000-0xd27ff pnpid ORM0000 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
atrtc0: <AT Real Time Clock> at port 0x70 irq 8 on isa0
ppc0: parallel port not found.
uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
uart0: [FILTER]
uart0: console (19200,n,8,1)
uart1: <16550 or compatible> at port 0x2f8-0x2ff irq 3 on isa0
uart1: [FILTER]
Timecounter "TSC" frequency 499903982 Hz quality 800
Timecounters tick every 1.000 msec
IPsec: Initialized Security Association Processing.
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 480Mbps High Speed USB v2.0
ad0: 76319MB <WDC WD800VE-00HDT0 09.07D09> at ata0-master UDMA100 
ugen0.1: <AMD> at usbus0
uhub0: <AMD OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <AMD> at usbus1
uhub1: <AMD EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
GEOM: ad0s1: geometry does not match label (255h,63s != 16h,63s).
uhub0: 4 ports with 4 removable, self powered
Root mount waiting for: usbus1
Root mount waiting for: usbus1
uhub1: 4 ports with 4 removable, self powered
Trying to mount root from ufs:/dev/ad0s1a
The last log rotation is from 2016:
[root@roadkiller /var/log]# stat /var/log/wtmp      
65 61783 -rw-r--r-- 1 root wheel 208219 1056 "Nov  1 05:00:01 2016" "Jan 18 22:29:16 2017" "Jan 18 22:29:16 2017" "Nov  1 05:00:01 2016" 16384 4 0 /var/log/wtmp
Interestingly, I switched between eicat and teksavvy on December 11th. Which year? Who knows!
Dec 11 16:38:40 roadkiller mpd: [eicatL0] LCP: authorization successful
Dec 11 16:41:15 roadkiller mpd: [teksavvyL0] LCP: authorization successful
Never realized those good old logs had a "oh dear forgot the year" issue (that's something like Y2K except just "Y", I guess). That was probably 2015, because the log dates from 2017, and the last entry is from November of the year after the above:
[root@roadkiller /var/log]# stat mpd.log 
65 47113 -rw-r--r-- 1 root wheel 193008 71939195 "Jan 18 22:39:18 2017" "Jan 18 22:39:59 2017" "Jan 18 22:39:59 2017" "Apr  2 10:41:37 2013" 16384 140640 0 mpd.log
It looks like the system was installed in 2010:
[root@roadkiller /var/log]# stat /
63 2 drwxr-xr-x 21 root wheel 2120 512 "Jan 18 22:34:43 2017" "Jan 18 22:28:12 2017" "Jan 18 22:28:12 2017" "Jul 18 22:25:00 2010" 16384 4 0 /
... so it lived for about 6 years, but still works after almost 14 years, which I find utterly amazing. Another amazing thing is that there's tuptime installed on that server! That is a software I thought I discovered later and then sponsored in Debian, but turns out I was already using it then!
[root@roadkiller /var]# tuptime 
System startups:        19   since   21:20:16 11/07/15
System shutdowns:       0 ok   -   18 bad
System uptime:          85.93 %   -   1 year, 11 days, 10 hours, 3 minutes and 36 seconds
System downtime:        14.07 %   -   61 days, 15 hours, 22 minutes and 45 seconds
System life:            1 year, 73 days, 1 hour, 26 minutes and 20 seconds
Largest uptime:         122 days, 9 hours, 17 minutes and 6 seconds   from   08:17:56 02/02/16
Shortest uptime:        5 minutes and 4 seconds   from   21:55:00 01/18/17
Average uptime:         19 days, 19 hours, 28 minutes and 37 seconds
Largest downtime:       57 days, 1 hour, 9 minutes and 59 seconds   from   20:45:01 11/22/16
Shortest downtime:      -1 years, 364 days, 23 hours, 58 minutes and 12 seconds   from   22:30:01 01/18/17
Average downtime:       3 days, 5 hours, 51 minutes and 43 seconds
Current uptime:         18 minutes and 23 seconds   since   22:28:13 01/18/17
Actual up/down times:
[root@roadkiller /var]# tuptime -t
No.        Startup Date                                         Uptime       Shutdown Date   End                                                  Downtime
1     21:20:16 11/07/15      1 day, 0 hours, 40 minutes and 12 seconds   22:00:28 11/08/15   BAD                                  2 minutes and 37 seconds
2     22:03:05 11/08/15      1 day, 9 hours, 41 minutes and 57 seconds   07:45:02 11/10/15   BAD                                  3 minutes and 24 seconds
3     07:48:26 11/10/15    20 days, 2 hours, 41 minutes and 34 seconds   10:30:00 11/30/15   BAD                        4 hours, 50 minutes and 21 seconds
4     15:20:21 11/30/15                      19 minutes and 40 seconds   15:40:01 11/30/15   BAD                                   6 minutes and 5 seconds
5     15:46:06 11/30/15                      53 minutes and 55 seconds   16:40:01 11/30/15   BAD                           1 hour, 1 minute and 38 seconds
6     17:41:39 11/30/15     6 days, 16 hours, 3 minutes and 22 seconds   09:45:01 12/07/15   BAD                4 days, 6 hours, 53 minutes and 11 seconds
7     16:38:12 12/11/15   50 days, 17 hours, 56 minutes and 49 seconds   10:35:01 01/31/16   BAD                                 10 minutes and 52 seconds
8     10:45:53 01/31/16     1 day, 21 hours, 28 minutes and 16 seconds   08:14:09 02/02/16   BAD                                  3 minutes and 48 seconds
9     08:17:56 02/02/16    122 days, 9 hours, 17 minutes and 6 seconds   18:35:02 06/03/16   BAD                                 10 minutes and 16 seconds
10    18:45:18 06/03/16   29 days, 17 hours, 14 minutes and 43 seconds   12:00:01 07/03/16   BAD                                 12 minutes and 34 seconds
11    12:12:35 07/03/16   31 days, 17 hours, 17 minutes and 26 seconds   05:30:01 08/04/16   BAD                                 14 minutes and 25 seconds
12    05:44:26 08/04/16     15 days, 1 hour, 55 minutes and 35 seconds   07:40:01 08/19/16   BAD                                  6 minutes and 51 seconds
13    07:46:52 08/19/16     7 days, 5 hours, 23 minutes and 10 seconds   13:10:02 08/26/16   BAD                                  3 minutes and 45 seconds
14    13:13:47 08/26/16   27 days, 21 hours, 36 minutes and 14 seconds   10:50:01 09/23/16   BAD                                  2 minutes and 14 seconds
15    10:52:15 09/23/16   60 days, 10 hours, 52 minutes and 46 seconds   20:45:01 11/22/16   BAD                 57 days, 1 hour, 9 minutes and 59 seconds
16    21:55:00 01/18/17                        5 minutes and 4 seconds   22:00:04 01/18/17   BAD                                 11 minutes and 15 seconds
17    22:11:19 01/18/17                       8 minutes and 42 seconds   22:20:01 01/18/17   BAD                                   1 minute and 20 seconds
18    22:21:21 01/18/17                       8 minutes and 40 seconds   22:30:01 01/18/17   BAD   -1 years, 364 days, 23 hours, 58 minutes and 12 seconds
19    22:28:13 01/18/17                      20 minutes and 17 seconds
The last few entries are actually the tests I'm running now, it seems this machine thinks we're now on 2017-01-18 at ~22:00, while we're actually 2024-01-24 at ~12:00 local:
Wed Jan 18 23:05:38 EST 2017
FreeBSD/i386 (roadkiller.anarc.at) (ttyu0)
login: root
Password:
Jan 18 23:07:10 roadkiller login: ROOT LOGIN (root) ON ttyu0
Last login: Wed Jan 18 22:29:16 on ttyu0
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 8.4-RELEASE-p12 (ROADKILL) #5: Fri Jun  6 02:43:23 EDT 2014
Reminders:
 * commit stuff in /etc
 * reload firewall (in screen!):
    pfctl -f /etc/pf.conf ; sleep 1
 * vim + syn on makes pf.conf more readable
 * monitoring the PPPoE uplink:
   tail -f /var/log/mpd.log
Current problems:
 * sometimes pf doesn't start properly on boot, if pppoe failed to come up, use
   this to resume:
     /etc/rc.d/pf start
   it will kill your shell, but fix NAT (2012-08-10)
 * babel fails to start on boot (2013-06-15):
     babeld -D -g 33123 tap0 vr3
 * DNS often fails, tried messing with unbound.conf (2014-10-05) and updating
   named.root (2016-01-28) and performance tweaks (ee63689)
 * asterisk and mpd4 are deprecated and should be uninstalled when we're sure
   their replacements (voipms + ata and mpd5) are working (2015-01-13)
 * if IPv6 fails, it's because netblocks are not being routed upstream. DHCPcd
   should do this, but doesn't start properly, use this to resume (2015-12-21):
     /usr/local/sbin/dhcpcd -6 --persistent --background --timeout 0 -C resolv.conf ng0
This machine is doomed to be replaced with the new omnia router, Indiegogo
campaign should ship in april 2016: http://igg.me/at/turris-omnia/x
(I really like the motd I left myself there. In theory, I guess this could just start connecting to the internet again if I still had the same PPPoE/ADSL link I had almost a decade ago; obviously, I do not.) Not sure how the system figured the 2017 time: the onboard clock itself believes we're in 1980, so clearly the CMOS battery has (understandably) failed:
> ?
comBIOS Monitor Commands
boot [drive][:partition] INT19 Boot
reboot                   cold boot
download                 download a file using XMODEM/CRC
flashupdate              update flash BIOS with downloaded file
time [HH:MM:SS]          show or set time
date [YYYY/MM/DD]        show or set date
d[b w d] [adr]           dump memory bytes/words/dwords
e[b w d] adr value [...] enter bytes/words/dwords
i[b w d] port            input from 8/16/32-bit port
o[b w d] port value      output to 8/16/32-bit port
run adr                  execute code at adr
cmosread [adr]           read CMOS RAM data
cmoswrite adr byte [...] write CMOS RAM data
cmoschecksum             update CMOS RAM Checksum
set parameter=value      set system parameter to value
show [parameter]         show one or all system parameters
?/help                   show this help
> show
ConSpeed = 19200
ConLock = Enabled
ConMute = Disabled
BIOSentry = Enabled
PCIROMS = Enabled
PXEBoot = Enabled
FLASH = Primary
BootDelay = 5
FastBoot = Disabled
BootPartition = Disabled
BootDrive = 80 81 F0 FF 
ShowPCI = Enabled
Reset = Hard
CpuSpeed = Default
> time
Current Date and Time is: 1980/01/01 00:56:47
Another bit of archeology: I had documented various outages with my ISP... back in 2003!
[root@roadkiller ~/bin]# cat ppp_stats/downtimes.txt
11/03/2003 18:24:49 218
12/03/2003 09:10:49 118
12/03/2003 10:05:57 680
12/03/2003 10:14:50 106
12/03/2003 10:16:53 6
12/03/2003 10:35:28 146
12/03/2003 10:57:26 393
12/03/2003 11:16:35 5
12/03/2003 11:16:54 11
13/03/2003 06:15:57 18928
13/03/2003 09:43:36 9730
13/03/2003 10:47:10 23
13/03/2003 10:58:35 5
16/03/2003 01:32:36 338
16/03/2003 02:00:33 120
16/03/2003 11:14:31 14007
19/03/2003 00:56:27 11179
19/03/2003 00:56:43 5
19/03/2003 00:56:53 0
19/03/2003 00:56:55 1
19/03/2003 00:57:09 1
19/03/2003 00:57:10 1
19/03/2003 00:57:24 1
19/03/2003 00:57:25 1
19/03/2003 00:57:39 1
19/03/2003 00:57:40 1
19/03/2003 00:57:44 3
19/03/2003 00:57:53 0
19/03/2003 00:57:55 0
19/03/2003 00:58:08 0
19/03/2003 00:58:10 0
19/03/2003 00:58:23 0
19/03/2003 00:58:25 0
19/03/2003 00:58:39 1
19/03/2003 00:58:42 2
19/03/2003 00:58:58 5
19/03/2003 00:59:35 2
19/03/2003 00:59:47 3
19/03/2003 01:00:34 3
19/03/2003 01:00:39 0
19/03/2003 01:00:54 0
19/03/2003 01:01:11 2
19/03/2003 01:01:25 1
19/03/2003 01:01:48 1
19/03/2003 01:02:03 1
19/03/2003 01:02:10 2
19/03/2003 01:02:20 3
19/03/2003 01:02:44 3
19/03/2003 01:03:45 3
19/03/2003 01:04:39 2
19/03/2003 01:05:40 2
19/03/2003 01:06:35 2
19/03/2003 01:07:36 2
19/03/2003 01:08:31 2
19/03/2003 01:08:38 2
19/03/2003 01:10:07 3
19/03/2003 01:11:05 2
19/03/2003 01:12:03 3
19/03/2003 01:13:01 3
19/03/2003 01:13:58 2
19/03/2003 01:14:59 5
19/03/2003 01:15:54 2
19/03/2003 01:16:55 2
19/03/2003 01:17:50 2
19/03/2003 01:18:51 3
19/03/2003 01:19:46 2
19/03/2003 01:20:46 2
19/03/2003 01:21:42 3
19/03/2003 01:22:42 3
19/03/2003 01:23:37 2
19/03/2003 01:24:38 3
19/03/2003 01:25:33 2
19/03/2003 01:26:33 2
19/03/2003 01:27:30 3
19/03/2003 01:28:55 2
19/03/2003 01:29:56 2
19/03/2003 01:30:50 2
19/03/2003 01:31:42 3
19/03/2003 01:32:36 3
19/03/2003 01:33:27 2
19/03/2003 01:34:21 2
19/03/2003 01:35:22 2
19/03/2003 01:36:17 3
19/03/2003 01:37:18 2
19/03/2003 01:38:13 3
19/03/2003 01:39:39 2
19/03/2003 01:40:39 2
19/03/2003 01:41:35 3
19/03/2003 01:42:35 3
19/03/2003 01:43:31 3
19/03/2003 01:44:31 3
19/03/2003 01:45:53 3
19/03/2003 01:46:48 3
19/03/2003 01:47:48 2
19/03/2003 01:48:44 3
19/03/2003 01:49:44 2
19/03/2003 01:50:40 3
19/03/2003 01:51:39 1
19/03/2003 11:04:33 19   
19/03/2003 18:39:36 2833 
19/03/2003 18:54:05 825  
19/03/2003 19:04:00 454  
19/03/2003 19:08:11 210  
19/03/2003 19:41:44 272  
19/03/2003 21:18:41 208  
24/03/2003 04:51:16 6
27/03/2003 04:51:20 5
30/03/2003 04:51:25 5
31/03/2003 08:30:31 255  
03/04/2003 08:30:36 5
06/04/2003 01:16:00 621  
06/04/2003 22:18:08 17   
06/04/2003 22:32:44 13   
09/04/2003 22:33:12 28   
12/04/2003 22:33:17 6
15/04/2003 22:33:22 5
17/04/2003 15:03:43 18   
20/04/2003 15:03:48 5
23/04/2003 15:04:04 16   
23/04/2003 21:08:30 339  
23/04/2003 21:18:08 13   
23/04/2003 23:34:20 253  
26/04/2003 23:34:45 25   
29/04/2003 23:34:49 5
02/05/2003 13:10:01 185  
05/05/2003 13:10:06 5
08/05/2003 13:10:11 5
09/05/2003 14:00:36 63928
09/05/2003 16:58:52 2
11/05/2003 23:08:48 2
14/05/2003 23:08:53 6
17/05/2003 23:08:58 5
20/05/2003 23:09:03 5
23/05/2003 23:09:08 5
26/05/2003 23:09:14 5
29/05/2003 23:00:10 3
29/05/2003 23:03:01 10   
01/06/2003 23:03:05 4
04/06/2003 23:03:10 5
07/06/2003 23:03:38 28   
10/06/2003 23:03:50 12   
13/06/2003 23:03:55 6
14/06/2003 07:42:20 3
14/06/2003 14:37:08 3
15/06/2003 20:08:34 3
18/06/2003 20:08:39 6
21/06/2003 20:08:45 6
22/06/2003 03:05:19 138  
22/06/2003 04:06:28 3
25/06/2003 04:06:58 31   
28/06/2003 04:07:02 4
01/07/2003 04:07:06 4
04/07/2003 04:07:11 5
07/07/2003 04:07:16 5
12/07/2003 04:55:20 6
12/07/2003 19:09:51 1158 
12/07/2003 22:14:49 8025 
15/07/2003 22:14:54 6
16/07/2003 05:43:06 18   
19/07/2003 05:43:12 6
22/07/2003 05:43:17 5
23/07/2003 18:18:55 183  
23/07/2003 18:19:55 9
23/07/2003 18:29:15 158  
23/07/2003 19:48:44 4604 
23/07/2003 20:16:27 3
23/07/2003 20:37:29 1079 
23/07/2003 20:43:12 342  
23/07/2003 22:25:51 6158
Fascinating. I suspect the (IDE!) hard drive might be failing as I saw two new files created in /var that I didn't remember seeing before:
-rw-r--r--   1 root    wheel        0 Jan 18 22:55 3@T3
-rw-r--r--   1 root    wheel        0 Jan 18 22:55 DY5
So I shutdown the machine, possibly for the last time:
Waiting (max 60 seconds) for system process  bufdaemon' to stop...done
Waiting (max 60 seconds) for system process  syncer' to stop...
Syncing disks, vnodes remaining...3 3 0 1 1 0 0 done
All buffers synced.
Uptime: 36m43s
usbus0: Controller shutdown
uhub0: at usbus0, port 1, addr 1 (disconnected)
usbus0: Controller shutdown complete
usbus1: Controller shutdown
uhub1: at usbus1, port 1, addr 1 (disconnected)
usbus1: Controller shutdown complete
The operating system has halted.
Please press any key to reboot.
I'll finally note this was the last FreeBSD server I personally operated. I also used FreeBSD to setup the core routers at Koumbit but those were replaced with Debian recently as well. Thanks Soekris, that was some sturdy hardware. Hopefully this new Protectli router will live up to that "decade plus" challenge. Not sure what the fate of this device will be: I'll bring it to the next Montreal Debian & Stuff to see if anyone's interested, contact me if you can't show up and want this thing.

28 January 2024

Niels Thykier: Annotating the Debian packaging directory

In my previous blog post Providing online reference documentation for debputy, I made a point about how debhelper documentation was suboptimal on account of being static rather than online. The thing is that debhelper is not alone in this problem space, even if it is a major contributor to the number of packaging files you have to to know about. If we look at the "competition" here such as Fedora and Arch Linux, they tend to only have one packaging file. While most Debian people will tell you a long list of cons about having one packaging file (such a Fedora's spec file being 3+ domain specific languages "mashed" into one file), one major advantage is that there is only "the one packaging file". You only need to remember where to find the documentation for one file, which is great when you are running on wetware with limited storage capacity. Which means as a newbie, you can dedicate less mental resources to tracking multiple files and how they interact and more effort understanding the "one file" at hand. I started by asking myself how can we in Debian make the packaging stack more accessible to newcomers? Spoiler alert, I dug myself into rabbit hole and ended up somewhere else than where I thought I was going. I started by wanting to scan the debian directory and annotate all files that I could with documentation links. The logic was that if debputy could do that for you, then you could spend more mental effort elsewhere. So I combined debputy's packager provided files detection with a static list of files and I quickly had a good starting point for debputy-based packages.
Adding (non-static) dpkg and debhelper files to the mix Now, I could have closed the topic here and said "Look, I did debputy files plus couple of super common files". But I decided to take it a bit further. I added support for handling some dpkg files like packager provided files (such as debian/substvars and debian/symbols). But even then, we all know that debhelper is the big hurdle and a major part of the omission... In another previous blog post (A new Debian package helper: debputy), I made a point about how debputy could list all auxiliary files while debhelper could not. This was exactly the kind of feature that I would need for this feature, if this feature was to cover debhelper. Now, I also remarked in that blog post that I was not willing to maintain such a list. Also, I may have ranted about static documentation being unhelpful for debhelper as it excludes third-party provided tooling. Fortunately, a recent update to dh_assistant had provided some basic plumbing for loading dh sequences. This meant that getting a list of all relevant commands for a source package was a lot easier than it used to be. Once you have a list of commands, it would be possible to check all of them for dh's NOOP PROMISE hints. In these hints, a command can assert it does nothing if a given pkgfile is not present. This lead to the new dh_assistant list-guessed-dh-config-files command that will list all declared pkgfiles and which helpers listed them. With this combined feature set in place, debputy could call dh_assistant to get a list of pkgfiles, pretend they were packager provided files and annotate those along with manpage for the relevant debhelper command. The exciting thing about letting debpputy resolve the pkgfiles is that debputy will resolve "named" files automatically (debhelper tools will only do so when --name is passed), so it is much more likely to detect named pkgfiles correctly too. Side note: I am going to ignore the elephant in the room for now, which is dh_installsystemd and its package@.service files and the wide-spread use of debian/foo.service where there is no package called foo. For the latter case, the "proper" name would be debian/pkg.foo.service. With the new dh_assistant feature done and added to debputy, debputy could now detect the ubiquitous debian/install file. Excellent. But less great was that the very common debian/docs file was not. Turns out that dh_installdocs cannot be skipped by dh, so it cannot have NOOP PROMISE hints. Meh... Well, dh_assistant could learn about a new INTROSPECTABLE marker in addition to the NOOP PROMISE and then I could sprinkle that into a few commands. Indeed that worked and meant that debian/postinst (etc.) are now also detectable. At this point, debputy would be able to identify a wide range of debhelper related configuration files in debian/ and at least associate each of them with one or more commands. Nice, surely, this would be a good place to stop, right...?
Adding more metadata to the files The debhelper detected files only had a command name and manpage URI to that command. It would be nice if we could contextualize this a bit more. Like is this file installed into the package as is like debian/pam or is it a file list to be processed like debian/install. To make this distinction, I could add the most common debhelper file types to my static list and then merge the result together. Except, I do not want to maintain a full list in debputy. Fortunately, debputy has a quite extensible plugin infrastructure, so added a new plugin feature to provide this kind of detail and now I can outsource the problem! I split my definitions into two and placed the generic ones in the debputy-documentation plugin and moved the debhelper related ones to debhelper-documentation. Additionally, third-party dh addons could provide their own debputy plugin to add context to their configuration files. So, this gave birth file categories and configuration features, which described each file on different fronts. As an example, debian/gbp.conf could be tagged as a maint-config to signal that it is not directly related to the package build but more of a tool or style preference file. On the other hand, debian/install and debian/debputy.manifest would both be tagged as a pkg-helper-config. Files like debian/pam were tagged as ppf-file for packager provided file and so on. I mentioned configuration features above and those were added because, I have had a beef with debhelper's "standard" configuration file format as read by filearray and filedoublearray. They are often considered simple to understand, but it is hard to know how a tool will actually read the file. As an example, consider the following:
  • Will the debhelper use filearray, filedoublearray or none of them to read the file? This topic has about 2 bits of entropy.
  • Will the config file be executed if it is marked executable assuming you are using the right compat level? If it is executable, does dh-exec allow renaming for this file? This topic adds 1 or 2 bit of entropy depending on the context.
  • Will the config file be subject to glob expansions? This topic sounds like a boolean but is a complicated mess. The globs can be handled either by debhelper as it parses the file for you. In this case, the globs are applied to every token. However, this is not what dh_install does. Here the last token on each line is supposed to be a directory and therefore not subject to globs. Therefore, dh_install does the globbing itself afterwards but only on part of the tokens. So that is about 2 bits of entropy more. Actually, it gets worse...
    • If the file is executed, debhelper will refuse to expand globs in the output of the command, which was a deliberate design choice by the original debhelper maintainer took when he introduced the feature in debhelper/8.9.12. Except, dh_install feature interacts with the design choice and does enable glob expansion in the tool output, because it does so manually after its filedoublearray call.
So these "simple" files have way too many combinations of how they can be interpreted. I figured it would be helpful if debputy could highlight these difference, so I added support for those as well. Accordingly, debian/install is tagged with multiple tags including dh-executable-config and dh-glob-after-execute. Then, I added a datatable of these tags, so it would be easy for people to look up what they meant. Ok, this seems like a closed deal, right...?
Context, context, context However, the dh-executable-config tag among other are only applicable in compat 9 or later. It does not seem newbie friendly if you are told that this feature exist, but then have to read in the extended description that that it actually does not apply to your package. This problem seems fixable. Thanks to dh_assistant, it is easy to figure out which compat level the package is using. Then tweak some metadata to enable per compat level rules. With that tags like dh-executable-config only appears for packages using compat 9 or later. Also, debputy should be able to tell you where packager provided files like debian/pam are installed. We already have the logic for packager provided files that debputy supports and I am already using debputy engine for detecting the files. If only the plugin provided metadata gave me the install pattern, debputy would be able tell you where this file goes in the package. Indeed, a bit of tweaking later and setting install-pattern to usr/lib/pam.d/ name , debputy presented me with the correct install-path with the package name placing the name placeholder. Now, I have been using debian/pam as an example, because debian/pam is installed into usr/lib/pam.d in compat 14. But in earlier compat levels, it was installed into etc/pam.d. Well, I already had an infrastructure for doing compat file tags. Off we go to add install-pattern to the complat level infrastructure and now changing the compat level would change the path. Great. (Bug warning: The value is off-by-one in the current version of debhelper. This is fixed in git) Also, while we are in this install-pattern business, a number of debhelper config files causes files to be installed into a fixed directory. Like debian/docs which causes file to be installed into /usr/share/docs/ package . Surely, we can expand that as well and provide that bit of context too... and done. (Bug warning: The code currently does not account for the main documentation package context) It is rather common pattern for people to do debian/foo.in files, because they want to custom generation of debian/foo. Which means if you have debian/foo you get "Oh, let me tell you about debian/foo ". Then you rename it to debian/foo.in and the result is "debian/foo.in is a total mystery to me!". That is suboptimal, so lets detect those as well as if they were the original file but add a tag saying that they are a generate template and which file we suspect it generates. Finally, if you use debputy, almost all of the standard debhelper commands are removed from the sequence, since debputy replaces them. It would be weird if these commands still contributed configuration files when they are not actually going to be invoked. This mostly happened naturally due to the way the underlying dh_assistant command works. However, any file mentioned by the debhelper-documentation plugin would still appear unfortunately. So off I went to filter the list of known configuration files against which dh_ commands that dh_assistant thought would be used for this package.
Wrapping it up I was several layers into this and had to dig myself out. I have ended up with a lot of data and metadata. But it was quite difficult for me to arrange the output in a user friendly manner. However, all this data did seem like it would be useful any tool that wants to understand more about the package. So to get out of the rabbit hole, I for now wrapped all of this into JSON and now we have a debputy tool-support annotate-debian-directory command that might be useful for other tools. To try it out, you can try the following demo: In another day, I will figure out how to structure this output so it is useful for non-machine consumers. Suggestions are welcome. :)
Limitations of the approach As a closing remark, I should probably remind people that this feature relies heavily on declarative features. These include:
  • When determining which commands are relevant, using Build-Depends: dh-sequence-foo is much more reliable than configuring it via the Turing complete configuration we call debian/rules.
  • When debhelper commands use NOOP promise hints, dh_assistant can "see" the config files listed those hints, meaning the file will at least be detected. For new introspectable hint and the debputy plugin, it is probably better to wait until the dust settles a bit before adding any of those.
You can help yourself and others to better results by using the declarative way rather than using debian/rules, which is the bane of all introspection!

25 January 2024

Dimitri John Ledkov: Ubuntu Livepatch service now supports over 60 different kernels

Linux kernel getting a livepatch whilst running a marathon. Generated with AI.
Livepatch service eliminates the need for unplanned maintenance windows for high and critical severity kernel vulnerabilities by patching the Linux kernel while the system runs. Originally the service launched in 2016 with just a single kernel flavour supported.Over the years, additional kernels were added: new LTS releases, ESM kernels, Public Cloud kernels, and most recently HWE kernels too.Recently livepatch support was expanded for FIPS compliant kernels, Public cloud FIPS compliant kernels, and as well IBM Z (mainframe) kernels. Bringing the total of kernel flavours support to over 60 distinct kernel flavours supported in parallel. The table of supported kernels in the documentation lists the supported kernel flavours ABIs, the duration of individual build's support window, supported architectures, and the Ubuntu release. This work was only possible thanks to the collaboration with the Ubuntu Certified Public Cloud team, engineers at IBM for IBM Z (s390x) support, Ubuntu Pro team, Livepatch server & client teams.It is a great milestone, and I personally enjoy seeing the non-intrusive popup on my Ubuntu Desktop that a kernel livepatch was applied to my running system. I do enable Ubuntu Pro on my personal laptop thanks to the free Ubuntu Pro subscription for individuals.What's next? The next frontier is supporting ARM64 kernels. The Canonical kernel team has completed the gap analysis to start supporting Livepatch Service for ARM64. Upstream Linux requires development work on the consistency model to fully support livepatch on ARM64 processors. Livepatch code changes are applied on a per-task basis, when the task is deemed safe to switch over. This safety check depends mostly on kernel stacktraces. For these checks, CONFIG_HAVE_RELIABLE_STACKTRACE needs to be available in the upstream ARM64 kernel. (see The Linux Kernel Documentation). There are preliminary patches that enable reliable stacktraces on ARM64, however these turned out to be problematic as there are lots of fix revisions that came after the initial patchset that AWS ships with 5.10. This is a call for help from any interested parties. If you have engineering resources and are interested in bringing Livepatch Service to your ARM64 platforms, please reach out to the Canonical Kernel team on the public Ubuntu Matrix, Discourse, and mailing list. If you want to chat in person, see you at FOSDEM next weekend.

22 January 2024

Paul Tagliamonte: Writing a simulator to check phased array beamforming

Interested in future updates? Follow me on mastodon at @paul@soylent.green. Posts about hz.tools will be tagged #hztools.

If you're on the Fediverse, I'd very much appreciate boosts on my toot!
While working on hz.tools, I started to move my beamforming code from 2-D (meaning, beamforming to some specific angle on the X-Y plane for waves on the X-Y plane) to 3-D. I ll have more to say about that once I get around to publishing the code as soon as I m sure it s not completely wrong, but in the meantime I decided to write a simple simulator to visually check the beamformer against the textbooks. The results were pretty rad, so I figured I d throw together a post since it s interesting all on its own outside of beamforming as a general topic. I figured I d write this in Rust, since I ve been using Rust as my primary language over at zoo, and it s a good chance to learn the language better.
This post has some large GIFs

It make take a little bit to load depending on your internet connection. Sorry about that, I'm not clever enough to do better without doing tons of complex engineering work. They may be choppy while they load or something. I tried to compress an ensmall them, so if they're loaded but fuzzy, click on them to load a slightly larger version.
This post won t cover the basics of how phased arrays work or the specifics of calculating the phase offsets for each antenna, but I ll dig into how I wrote a simple simulator and how I wound up checking my phase offsets to generate the renders below.

Assumptions I didn t want to build a general purpose RF simulator, anything particularly generic, or something that would solve for any more than the things right in front of me. To do this as simply (and quickly all this code took about a day to write, including the beamforming math) I had to reduce the amount of work in front of me. Given that I was concerend with visualizing what the antenna pattern would look like in 3-D given some antenna geometry, operating frequency and configured beam, I made the following assumptions: All anetnnas are perfectly isotropic they receive a signal that is exactly the same strength no matter what direction the signal originates from. There s a single point-source isotropic emitter in the far-field (I modeled this as being 1 million meters away 1000 kilometers) of the antenna system. There is no noise, multipath, loss or distortion in the signal as it travels through space. Antennas will never interfere with each other.

2-D Polar Plots The last time I wrote something like this, I generated 2-D GIFs which show a radiation pattern, not unlike the polar plots you d see on a microphone. These are handy because it lets you visualize what the directionality of the antenna looks like, as well as in what direction emissions are captured, and in what directions emissions are nulled out. You can see these plots on spec sheets for antennas in both 2-D and 3-D form. Now, let s port the 2-D approach to 3-D and see how well it works out.

Writing the 3-D simulator As an EM wave travels through free space, the place at which you sample the wave controls that phase you observe at each time-step. This means, assuming perfectly synchronized clocks, a transmitter and receiver exactly one RF wavelength apart will observe a signal in-phase, but a transmitter and receiver a half wavelength apart will observe a signal 180 degrees out of phase. This means that if we take the distance between our point-source and antenna element, divide it by the wavelength, we can use the fractional part of the resulting number to determine the phase observed. If we multiply that number (in the range of 0 to just under 1) by tau, we can generate a complex number by taking the cos and sin of the multiplied phase (in the range of 0 to tau), assuming the transmitter is emitting a carrier wave at a static amplitude and all clocks are in perfect sync.
 let observed_phases: Vec<Complex> = antennas
.iter()
.map( antenna   
let distance = (antenna - tx).magnitude();
let distance = distance - (distance as i64 as f64);
((distance / wavelength) * TAU)
 )
.map( phase  Complex(phase.cos(), phase.sin()))
.collect();
At this point, given some synthetic transmission point and each antenna, we know what the expected complex sample would be at each antenna. At this point, we can adjust the phase of each antenna according to the beamforming phase offset configuration, and add up every sample in order to determine what the entire system would collectively produce a sample as.
 let beamformed_phases: Vec<Complex> = ...;
let magnitude = beamformed_phases
.iter()
.zip(observed_phases.iter())
.map( (beamformed, observed)  observed * beamformed)
.reduce( acc, el  acc + el)
.unwrap()
.abs();
Armed with this information, it s straight forward to generate some number of (Azimuth, Elevation) points to sample, generate a transmission point far away in that direction, resolve what the resulting Complex sample would be, take its magnitude, and use that to create an (x, y, z) point at (azimuth, elevation, magnitude). The color attached two that point is based on its distance from (0, 0, 0). I opted to use the Life Aquatic table for this one. After this process is complete, I have a point cloud of ((x, y, z), (r, g, b)) points. I wrote a small program using kiss3d to render point cloud using tons of small spheres, and write out the frames to a set of PNGs, which get compiled into a GIF. Now for the fun part, let s take a look at some radiation patterns!

1x4 Phased Array The first configuration is a phased array where all the elements are in perfect alignment on the y and z axis, and separated by some offset in the x axis. This configuration can sweep 180 degrees (not the full 360), but can t be steared in elevation at all. Let s take a look at what this looks like for a well constructed 1x4 phased array: And now let s take a look at the renders as we play with the configuration of this array and make sure things look right. Our initial quarter-wavelength spacing is very effective and has some outstanding performance characteristics. Let s check to see that everything looks right as a first test. Nice. Looks perfect. When pointing forward at (0, 0), we d expect to see a torus, which we do. As we sweep between 0 and 360, astute observers will notice the pattern is mirrored along the axis of the antennas, when the beam is facing forward to 0 degrees, it ll also receive at 180 degrees just as strong. There s a small sidelobe that forms when it s configured along the array, but it also becomes the most directional, and the sidelobes remain fairly small.

Long compared to the wavelength (1 ) Let s try again, but rather than spacing each antenna of a wavelength apart, let s see about spacing each antenna 1 of a wavelength apart instead. The main lobe is a lot more narrow (not a bad thing!), but some significant sidelobes have formed (not ideal). This can cause a lot of confusion when doing things that require a lot of directional resolution unless they re compensated for.

Going from ( to 5 ) The last model begs the question - what do things look like when you separate the antennas from each other but without moving the beam? Let s simulate moving our antennas but not adjusting the configured beam or operating frequency. Very cool. As the spacing becomes longer in relation to the operating frequency, we can see the sidelobes start to form out of the end of the antenna system.

2x2 Phased Array The second configuration I want to try is a phased array where the elements are in perfect alignment on the z axis, and separated by a fixed offset in either the x or y axis by their neighbor, forming a square when viewed along the x/y axis. Let s take a look at what this looks like for a well constructed 2x2 phased array: Let s do the same as above and take a look at the renders as we play with the configuration of this array and see what things look like. This configuration should suppress the sidelobes and give us good performance, and even give us some amount of control in elevation while we re at it. Sweet. Heck yeah. The array is quite directional in the configured direction, and can even sweep a little bit in elevation, a definite improvement from the 1x4 above.

Long compared to the wavelength (1 ) Let s do the same thing as the 1x4 and take a look at what happens when the distance between elements is long compared to the frequency of operation say, 1 of a wavelength apart? What happens to the sidelobes given this spacing when the frequency of operation is much different than the physical geometry? Mesmerising. This is my favorate render. The sidelobes are very fun to watch come in and out of existence. It looks absolutely other-worldly.

Going from ( to 5 ) Finally, for completeness' sake, what do things look like when you separate the antennas from each other just as we did with the 1x4? Let s simulate moving our antennas but not adjusting the configured beam or operating frequency. Very very cool. The sidelobes wind up turning the very blobby cardioid into an electromagnetic dog toy. I think we ve proven to ourselves that using a phased array much outside its designed frequency of operation seems like a real bad idea.

Future Work Now that I have a system to test things out, I m a bit more confident that my beamforming code is close to right! I d love to push that code over the line and blog about it, since it s a really interesting topic on its own. Once I m sure the code involved isn t full of lies, I ll put it up on the hztools org, and post about it here and on mastodon.

21 January 2024

Debian Brasil: MiniDebConf BH 2024 - patroc nio e financiamento coletivo

MiniDebConf BH 2024 J est rolando a inscri o de participante e a chamada de atividades para a MiniDebConf Belo Horizonte 2024, que acontecer de 27 a 30 de abril no Campus Pampulha da UFMG. Este ano estamos ofertando bolsas de alimenta o, hospedagem e passagens para contribuidores(as) ativos(as) do Projeto Debian. Patroc nio: Para a realiza o da MiniDebConf, estamos buscando patroc nio financeiro de empresas e entidades. Ent o se voc trabalha em uma empresa/entidade (ou conhece algu m que trabalha em uma) indique o nosso plano de patroc nio para ela. L voc ver os valores de cada cota e os seus benef cios. Financiamento coletivo: Mas voc tamb m pode ajudar a realiza o da MiniDebConf por meio do nosso financiamento coletivo! Fa a uma doa o de qualquer valor e tenha o seu nome publicado no site do evento como apoiador(a) da MiniDebConf Belo Horizonte 2024. Mesmo que voc n o pretenda vir a Belo Horizonte para participar do evento, voc pode doar e assim contribuir para o mais importante evento do Projeto Debian no Brasil. Contato Qualquer d vida, mande um email para contato@debianbrasil.org.br Organiza o Debian Brasil Debian Debian MG DCC

20 January 2024

Niels Thykier: Making debputy: Writing declarative parsing logic

In this blog post, I will cover how debputy parses its manifest and the conceptual improvements I did to make parsing of the manifest easier. All instructions to debputy are provided via the debian/debputy.manifest file and said manifest is written in the YAML format. After the YAML parser has read the basic file structure, debputy does another pass over the data to extract the information from the basic structure. As an example, the following YAML file:
manifest-version: "0.1"
installations:
  - install:
      source: foo
      dest-dir: usr/bin
would be transformed by the YAML parser into a structure resembling:
 
  "manifest-version": "0.1",
  "installations": [
      
       "install":  
         "source": "foo",
         "dest-dir": "usr/bin",
        
      
  ]
 
This structure is then what debputy does a pass on to translate this into an even higher level format where the "install" part is translated into an InstallRule. In the original prototype of debputy, I would hand-write functions to extract the data that should be transformed into the internal in-memory high level format. However, it was quite tedious. Especially because I wanted to catch every possible error condition and report "You are missing the required field X at Y" rather than the opaque KeyError: X message that would have been the default. Beyond being tedious, it was also quite error prone. As an example, in debputy/0.1.4 I added support for the install rule and you should allegedly have been able to add a dest-dir: or an as: inside it. Except I crewed up the code and debputy was attempting to look up these keywords from a dict that could never have them. Hand-writing these parsers were so annoying that it demotivated me from making manifest related changes to debputy simply because I did not want to code the parsing logic. When I got this realization, I figured I had to solve this problem better. While reflecting on this, I also considered that I eventually wanted plugins to be able to add vocabulary to the manifest. If the API was "provide a callback to extract the details of whatever the user provided here", then the result would be bad.
  1. Most plugins would probably throw KeyError: X or ValueError style errors for quite a while. Worst case, they would end on my table because the user would have a hard time telling where debputy ends and where the plugins starts. "Best" case, I would teach debputy to say "This poor error message was brought to you by plugin foo. Go complain to them". Either way, it would be a bad user experience.
  2. This even assumes plugin providers would actually bother writing manifest parsing code. If it is that difficult, then just providing a custom file in debian might tempt plugin providers and that would undermine the idea of having the manifest be the sole input for debputy.
So beyond me being unsatisfied with the current situation, it was also clear to me that I needed to come up with a better solution if I wanted externally provided plugins for debputy. To put a bit more perspective on what I expected from the end result:
  1. It had to cover as many parsing errors as possible. An error case this code would handle for you, would be an error where I could ensure it sufficient degree of detail and context for the user.
  2. It should be type-safe / provide typing support such that IDEs/mypy could help you when you work on the parsed result.
  3. It had to support "normalization" of the input, such as
           # User provides
           - install: "foo"
           # Which is normalized into:
           - install:
               source: "foo"
4) It must be simple to tell  debputy  what input you expected.
At this point, I remembered that I had seen a Python (PYPI) package where you could give it a TypedDict and an arbitrary input (Sadly, I do not remember the name). The package would then validate the said input against the TypedDict. If the match was successful, you would get the result back casted as the TypedDict. If the match was unsuccessful, the code would raise an error for you. Conceptually, this seemed to be a good starting point for where I wanted to be. Then I looked a bit on the normalization requirement (point 3). What is really going on here is that you have two "schemas" for the input. One is what the programmer will see (the normalized form) and the other is what the user can input (the manifest form). The problem is providing an automatic normalization from the user input to the simplified programmer structure. To expand a bit on the following example:
# User provides
- install: "foo"
# Which is normalized into:
- install:
    source: "foo"
Given that install has the attributes source, sources, dest-dir, as, into, and when, how exactly would you automatically normalize "foo" (str) into source: "foo"? Even if the code filtered by "type" for these attributes, you would end up with at least source, dest-dir, and as as candidates. Turns out that TypedDict actually got this covered. But the Python package was not going in this direction, so I parked it here and started looking into doing my own. At this point, I had a general idea of what I wanted. When defining an extension to the manifest, the plugin would provide debputy with one or two definitions of TypedDict. The first one would be the "parsed" or "target" format, which would be the normalized form that plugin provider wanted to work on. For this example, lets look at an earlier version of the install-examples rule:
# Example input matching this typed dict.
#    
#       "source": ["foo"]
#       "into": ["pkg"]
#    
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]
In this form, the install-examples has two attributes - both are list of strings. On the flip side, what the user can input would look something like this:
# Example input matching this typed dict.
#    
#       "source": "foo"
#       "into": "pkg"
#    
#
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[str]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[str, List[str]]
FullInstallExamplesManifestFormat = Union[
    InstallExamplesManifestFormat,
    List[str],
    str,
]
The idea was that the plugin provider would use these two definitions to tell debputy how to parse install-examples. Pseudo-registration code could look something like:
def _handler(
    normalized_form: InstallExamplesTargetFormat,
) -> InstallRule:
    ...  # Do something with the normalized form and return an InstallRule.
concept_debputy_api.add_install_rule(
  keyword="install-examples",
  target_form=InstallExamplesTargetFormat,
  manifest_form=FullInstallExamplesManifestFormat,
  handler=_handler,
)
This was my conceptual target and while the current actual API ended up being slightly different, the core concept remains the same.
From concept to basic implementation Building this code is kind like swallowing an elephant. There was no way I would just sit down and write it from one end to the other. So the first prototype of this did not have all the features it has now. Spoiler warning, these next couple of sections will contain some Python typing details. When reading this, it might be helpful to know things such as Union[str, List[str]] being the Python type for either a str (string) or a List[str] (list of strings). If typing makes your head spin, these sections might less interesting for you. To build this required a lot of playing around with Python's introspection and typing APIs. My very first draft only had one "schema" (the normalized form) and had the following features:
  • Read TypedDict.__required_attributes__ and TypedDict.__optional_attributes__ to determine which attributes where present and which were required. This was used for reporting errors when the input did not match.
  • Read the types of the provided TypedDict, strip the Required / NotRequired markers and use basic isinstance checks based on the resulting type for str and List[str]. Again, used for reporting errors when the input did not match.
This prototype did not take a long (I remember it being within a day) and worked surprisingly well though with some poor error messages here and there. Now came the first challenge, adding the manifest format schema plus relevant normalization rules. The very first normalization I did was transforming into: Union[str, List[str]] into into: List[str]. At that time, source was not a separate attribute. Instead, sources was a Union[str, List[str]], so it was the only normalization I needed for all my use-cases at the time. There are two problems when writing a normalization. First is determining what the "source" type is, what the target type is and how they relate. The second is providing a runtime rule for normalizing from the manifest format into the target format. Keeping it simple, the runtime normalizer for Union[str, List[str]] -> List[str] was written as:
def normalize_into_list(x: Union[str, List[str]]) -> List[str]:
    return x if isinstance(x, list) else [x]
This basic form basically works for all types (assuming none of the types will have List[List[...]]). The logic for determining when this rule is applicable is slightly more involved. My current code is about 100 lines of Python code that would probably lose most of the casual readers. For the interested, you are looking for _union_narrowing in declarative_parser.py With this, when the manifest format had Union[str, List[str]] and the target format had List[str] the generated parser would silently map a string into a list of strings for the plugin provider. But with that in place, I had covered the basics of what I needed to get started. I was quite excited about this milestone of having my first keyword parsed without handwriting the parser logic (at the expense of writing a more generic parse-generator framework).
Adding the first parse hint With the basic implementation done, I looked at what to do next. As mentioned, at the time sources in the manifest format was Union[str, List[str]] and I considered to split into a source: str and a sources: List[str] on the manifest side while keeping the normalized form as sources: List[str]. I ended up committing to this change and that meant I had to solve the problem getting my parser generator to understand the situation:
# Map from
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[str]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[str, List[str]]
# ... into
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]
There are two related problems to solve here:
  1. How will the parser generator understand that source should be normalized and then mapped into sources?
  2. Once that is solved, the parser generator has to understand that while source and sources are declared as NotRequired, they are part of a exactly one of rule (since sources in the target form is Required). This mainly came down to extra book keeping and an extra layer of validation once the previous step is solved.
While working on all of this type introspection for Python, I had noted the Annotated[X, ...] type. It is basically a fake type that enables you to attach metadata into the type system. A very random example:
# For all intents and purposes,  foo  is a string despite all the  Annotated  stuff.
foo: Annotated[str, "hello world"] = "my string here"
The exciting thing is that you can put arbitrary details into the type field and read it out again in your introspection code. Which meant, I could add "parse hints" into the type. Some "quick" prototyping later (a day or so), I got the following to work:
# Map from
#      
#        "source": "foo"  # (or "sources": ["foo"])
#        "into": "pkg"
#      
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[
        Annotated[
            str,
            DebputyParseHint.target_attribute("sources")
        ]
    ]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[str, List[str]]
# ... into
#      
#        "source": ["foo"]
#        "into": ["pkg"]
#      
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]
Without me (as a plugin provider) writing a line of code, I can have debputy rename or "merge" attributes from the manifest form into the normalized form. Obviously, this required me (as the debputy maintainer) to write a lot code so other me and future plugin providers did not have to write it.
High level typing At this point, basic normalization between one mapping to another mapping form worked. But one thing irked me with these install rules. The into was a list of strings when the parser handed them over to me. However, I needed to map them to the actual BinaryPackage (for technical reasons). While I felt I was careful with my manual mapping, I knew this was exactly the kind of case where a busy programmer would skip the "is this a known package name" check and some user would typo their package resulting in an opaque KeyError: foo. Side note: "Some user" was me today and I was super glad to see debputy tell me that I had typoed a package name (I would have been more happy if I had remembered to use debputy check-manifest, so I did not have to wait through the upstream part of the build that happened before debhelper passed control to debputy...) I thought adding this feature would be simple enough. It basically needs two things:
  1. Conversion table where the parser generator can tell that BinaryPackage requires an input of str and a callback to map from str to BinaryPackage. (That is probably lie. I think the conversion table came later, but honestly I do remember and I am not digging into the git history for this one)
  2. At runtime, said callback needed access to the list of known packages, so it could resolve the provided string.
It was not super difficult given the existing infrastructure, but it did take some hours of coding and debugging. Additionally, I added a parse hint to support making the into conditional based on whether it was a single binary package. With this done, you could now write something like:
# Map from
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[
        Annotated[
            str,
            DebputyParseHint.target_attribute("sources")
        ]
    ]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[BinaryPackage, List[BinaryPackage]]
# ... into
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[
        Annotated[
            List[BinaryPackage],
            DebputyParseHint.required_when_multi_binary()
        ]
    ]
Code-wise, I still had to check for into being absent and providing a default for that case (that is still true in the current codebase - I will hopefully fix that eventually). But I now had less room for mistakes and a standardized error message when you misspell the package name, which was a plus.
The added side-effect - Introspection A lovely side-effect of all the parsing logic being provided to debputy in a declarative form was that the generated parser snippets had fields containing all expected attributes with their types, which attributes were required, etc. This meant that adding an introspection feature where you can ask debputy "What does an install rule look like?" was quite easy. The code base already knew all of this, so the "hard" part was resolving the input the to concrete rule and then rendering it to the user. I added this feature recently along with the ability to provide online documentation for parser rules. I covered that in more details in my blog post Providing online reference documentation for debputy in case you are interested. :)
Wrapping it up This was a short insight into how debputy parses your input. With this declarative technique:
  • The parser engine handles most of the error reporting meaning users get most of the errors in a standard format without the plugin provider having to spend any effort on it. There will be some effort in more complex cases. But the common cases are done for you.
  • It is easy to provide flexibility to users while avoiding having to write code to normalize the user input into a simplified programmer oriented format.
  • The parser handles mapping from basic types into higher forms for you. These days, we have high level types like FileSystemMode (either an octal or a symbolic mode), different kind of file system matches depending on whether globs should be performed, etc. These types includes their own validation and parsing rules that debputy handles for you.
  • Introspection and support for providing online reference documentation. Also, debputy checks that the provided attribute documentation covers all the attributes in the manifest form. If you add a new attribute, debputy will remind you if you forget to document it as well. :)
In this way everybody wins. Yes, writing this parser generator code was more enjoyable than writing the ad-hoc manual parsers it replaced. :)

Next.

Previous.